Cloudberry Backup – Affordable & Recommended Cloud Backup Service on Azure & AWS

Let me tell you about a CIO I knew from my days as a consultant. He was even-keeled most of the time and could handle just about anything thrown at him. There was just one time I saw him lose control: when someone told him the data backups had failed the previous night. He got so flustered, it was as if his career were flashing before him, the ship was sinking, and there were no lifeboats left.

If You Don’t Backup Data, What Are the Risks?

Backing up one's data is a problem as old as computing itself. We've all experienced data loss at some point, along with the pain, time, and costs of recovering from its impact on our personal or business lives. We back up to avoid these problems; it's insurance you hope you never have to use. But, as Murphy's Law goes, if anything can go wrong, it will.

Data storage systems include various forms of redundancy and the cloud is no exception. Though there are multiple levels of data protection within cloud block and object storage subsystems, no amount of protection can cure all potential ills. Sometimes, the only cure is to recover from a backup copy that has not been corrupted, deleted, or otherwise tampered with.

SoftNAS provides additional layers of data protection atop cloud block and object storage, including storage snapshots, checksum data integrity verification on each data block, block replication to other nodes for high availability, and file replication to a disaster recovery node. But even storage snapshots rely upon the underlying cloud storage block and object storage, which can and does fail occasionally.

These cloud-native storage systems tout anywhere from 99.9% up to 11 nines of data durability. What does this really mean? It means there’s a non-zero probability that your data could be lost – it’s never 100%. So, when things do go wrong, you’d do best to have at least one viable backup copy. Otherwise, in addition to recovering from the data loss event, you risk losing your job too.

Why Companies Must Have a Data Backup

Let me illustrate this through an in-house experience.

In 2013, when SoftNAS was a fledgling startup, we had to make every dollar count and it was hard to justify paying for backup software or the storage it requires.

Back then, we ran QuickBooks for accounting. We also had a build server running Jenkins (still do), domain controllers, and many other development and test VMs running atop VMware in our R&D lab. However, it was going to cost about $10,000 to license Veeam's backup software, and it just wasn't a high enough priority to allocate the funds, so we skimped on our backups. Then, over one weekend, we upgraded our VSAN cluster.

Unfortunately, something went awry and we lost the entire VSAN cluster along with all our VMs and data. In addition, our makeshift backup strategy had not been working as expected and we hadn’t been paying close attention to it, so, in effect, we had no backup.

I describe the way we felt at the time as the “downtime tunnel”. It’s when your vision narrows and all you can see is the hole that you’re trying to dig yourself out of, and you’re overcome by the dread of having to give hourly updates to your boss, and their boss. It’s not a position you want to be in.

This is how we scrambled out of that hole. Fortunately, our accountant had a copy of the QuickBooks file, albeit one that was about 5 months old. And thankfully we still had an old-fashioned hardware-based Windows domain controller, so we didn't lose our Windows domain. We had to painstakingly recreate our entire lab environment, rebuild a new QuickBooks file by re-entering the past 5 months of transactions, and recreate our Jenkins build server. After many weeks of recovery, we managed to put Humpty Dumpty back together again.

Lessons from Our Data Loss

We learned the hard way that proper data backups are much less expensive than the alternatives. The week after the data loss occurred, I placed the order for Veeam Backup and Recovery. Our R&D lab has been fully backed up since that day. Our Jenkins build server is now also versioned and safely tucked away in a Git repository so it’s quickly recoverable.

Of course, since then we have also outsourced accounting and no longer require QuickBooks, but with a significantly larger R&D operation now we simply cannot afford another such event with no backups ever again. The backup software is the best $10K we’ve ever invested in our R&D lab. The value of this protection outstrips the cost of data loss any day.

Cloud Backup as a Service

Fortunately, there are some great options available today to back up your data to the cloud, too. And they cost less to acquire and operate than you may realize. For example, SoftNAS has tested and certified the CloudBerry Backup product for use with SoftNAS. CloudBerry Backup (CBB) is a cloud backup solution available for both Linux and Windows. We tested the CloudBerry Backup for Linux, Ultimate Edition, which installs and runs directly on SoftNAS. It can run on any SoftNAS Linux-based virtual appliance, atop AWS, Azure, and VMware. We have customers who prefer to run CBB on Windows and perform the backups over a CIFS share. Did I forget to mention this cloud backup solution is affordable at just $150, and not $10K?

CBB performs full and incremental file backups from the SoftNAS ZFS filesystems and stores the data in low-cost, highly-durable object storage – S3 on AWS, and Azure blobs on Azure.

CBB supports a broad range of backup repositories, so you can choose to back up to one or more targets, within the same cloud or across different clouds as needed for additional redundancy. It is even possible to back up your SoftNAS pool data deployed in Azure to AWS, and vice versa. Note that we generally recommend creating a VPC-to-S3 or VNET-to-Blob service endpoint in your respective public cloud architecture to optimize network storage traffic and speed up backup timeframes.

To reduce the costs of backup storage even further, you can define lifecycle policies within the CloudBerry UI that move backups from object storage into archive storage. For example, on AWS, the initial backup is stored on S3, then a lifecycle policy (managed right in CBB) kicks in and moves the data out of S3 and into Glacier archive storage. This reduces backup storage costs to around $4/TB per month (or less at volume). You can optionally add a Glacier Deep Archive policy and reduce storage costs even further, down to about $1/TB per month. There is also an option to use S3 Infrequent Access storage.
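
CBB manages these transitions from its own UI, but for readers who want to see the mechanics, here is a rough sketch of what an equivalent native S3 lifecycle rule looks like when applied with boto3. The bucket name, prefix, and day thresholds are placeholders, not values from our setup.

```python
# Hypothetical example only: an S3 lifecycle rule that tiers backups from S3
# into Glacier and then Deep Archive. Bucket, prefix, and days are placeholders.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-cloudberry-backups",                      # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-backups-to-archive",
            "Filter": {"Prefix": "softnas-backups/"},    # placeholder prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "GLACIER"},        # roughly $4/TB-month
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},  # roughly $1/TB-month
            ],
        }]
    },
)
```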

There are similar capabilities available on Microsoft Azure that can be used to drive your data backup costs down to affordable levels. Bear in mind the current version of Cloudberry for Linux has no native Azure Blob lifecycle management integration. Those functions need to be performed via the Azure Portal.

Personally, I prefer to keep the latest version in S3 or Azure hot blob storage for immediate access and faster recovery, along with several archived copies for posterity. In some industries, you may have regulatory or contractual obligations to keep archive data much longer than with a typical archival policy.

Today, we also use CBB to back up our R&D lab’s Veeam backup repositories into the cloud as an additional DR measure. We use CBB for this because there are no excessive I/O costs when backing up into the cloud (Veeam performs a lot of synthetic merge and other I/O, which drives up I/O costs based on our testing).

In my book, there’s no excuse for not having file-level backups of every piece of important business data, given the costs and business impacts of the alternatives: downtime, lost time, overtime, stressful calls with the bosses, lost productivity, lost revenue, lost customers, brand and reputation impacts, and sometimes, lost jobs, lost promotion opportunities – it’s just too painful to consider what having no backup can devolve into.

To summarize, there are 5 levels of data protection available to secure your SoftNAS deployment:

1. ZFS scheduled snapshots – “point-in-time” meta-data recovery points on a per-volume basis
2. EBS / VM snapshots – snapshots of the Block Disks used in your SoftNAS pool
3. HA replicas – block replicated mirror copies updated once per minute
4. DR replica – file replica kept in a different region, just in case something catastrophic happens in your primary cloud datacenter
5. File system backups – CloudBerry or equivalent file-level backups to Azure Blob or Amazon S3
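
As a concrete illustration of level 2, here is a minimal boto3 sketch that snapshots every EBS volume carrying a particular tag. The tag key and value are assumptions for illustration; this is not SoftNAS tooling, just the general idea.

```python
# Minimal sketch (not SoftNAS tooling): snapshot every EBS volume tagged as part
# of a storage pool. The "Role: softnas-pool" tag is a hypothetical convention.
import boto3

ec2 = boto3.client("ec2")

volumes = ec2.describe_volumes(
    Filters=[{"Name": "tag:Role", "Values": ["softnas-pool"]}]
)["Volumes"]

for vol in volumes:
    snap = ec2.create_snapshot(
        VolumeId=vol["VolumeId"],
        Description=f"Scheduled pool snapshot of {vol['VolumeId']}",
    )
    print("Started snapshot", snap["SnapshotId"], "for", vol["VolumeId"])
```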

So, whether you choose to use CloudBerry Backup, Veeam®, native cloud backup (e.g., Azure Backup), or another vendor's backup solution, do yourself a big favor. Use *something* to ensure your data is always fully backed up, at the file level, and always recoverable no matter what shenanigans Murphy comes up with next. Trust me, you'll be glad you did!

Disclaimer:

SoftNAS is not affiliated in any way with CloudBerry Lab. As a CloudBerry customer, we trust our business data to CloudBerry. We also trust our VMware Lab and its data to Veeam. As a cloud NAS vendor, we have tested with and certified CloudBerry Backup as compatible with SoftNAS products. Your mileage may vary.

This post is authored by Rick Braddy, co-founder and CTO at Buurst SoftNAS. Rick has over 40 years of IT industry experience and contributed directly to the formation of the cloud NAS market.

Choosing the Right Type of AWS Storage for your Use-Case: Block storage

When choosing data storage, what do you look for? AWS offers several storage types and options, and each is better suited to certain purposes than others. For instance, if your business only needs to store data for compliance, with little need for access, Amazon S3 is a good bet. For enterprise applications, Amazon EBS SSD-backed volumes offer a Provisioned IOPS option to meet performance requirements.

And then there are concerns about the cost. Savings in cost usually come at the price of performance. However, the array of options from the AWS platform for storage means there usually is a type that achieves the balance of performance and cost that your business needs.

In this series, we are going to look at the defining features of each AWS storage type. We’re starting with block storage in this post, and by the end of the series, you should be able to tell which type of AWS storage sounds like the right fit for your business’ storage requirements.

AWS storage types – performance comparison

The first question that needs to be addressed is what matters most to your application or workload: throughput, latency, or IOPS?

The answer to this question will determine what type of storage is needed to ensure a successful migration or launch. AWS has a number of storage options, which break down into block, file, and object storage. Block storage comprises SSD and HDD volumes, file storage contains managed services like EFS and FSx, and the object storage category houses the S3 and Glacier variations.

AWS Block Storage

EBS Volume Types

Amazon Elastic Block Store (EBS) is used by attaching volumes to Amazon EC2 instances. It works well with applications that need low latency and consistent, predictable performance.

You may choose either EBS SSD volumes or EBS HDD volumes. SSD-backed volumes such as General Purpose (gp2) and Provisioned IOPS (io2) are a good fit for applications that perform a lot of read and write operations, like databases; gp2 in particular offers burst capability. Workloads with random read and write operations, low latency, and high input/output operations per second (IOPS) requirements are also suitable for SSDs.

If you opt for io2-backed volumes, you can buy read/write operations on demand regardless of volume size. Provisioned IOPS SSDs are designed to handle heavy workloads with good, consistent performance.

HDD-backed volumes like st1 and sc1 are ideal for workloads with sequential I/O access and high throughput requirements at a low cost. Preferred use cases for st1 and sc1 volumes include Hadoop, stream processing, Splunk/log processing, data warehouses, and media streaming.
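
To make the choice concrete, here is a hedged boto3 sketch that provisions one volume of each style: an io2 volume with IOPS purchased independently of size, and an st1 volume for sequential, throughput-heavy work. The availability zone, sizes, and IOPS figure are placeholders.

```python
# Illustrative only: one random-I/O volume (io2, provisioned IOPS) and one
# sequential/throughput volume (st1). All values are placeholders.
import boto3

ec2 = boto3.client("ec2")

db_volume = ec2.create_volume(
    AvailabilityZone="us-west-2a",   # placeholder AZ
    Size=500,                        # GiB
    VolumeType="io2",
    Iops=16000,                      # bought on demand, independent of size
)

log_volume = ec2.create_volume(
    AvailabilityZone="us-west-2a",
    Size=2000,                       # st1 favors large, sequential workloads
    VolumeType="st1",
)

print(db_volume["VolumeId"], log_volume["VolumeId"])
```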

EBS-optimized instances

To improve performance, consider EBS-optimized instances. Such an instance has dedicated bandwidth to EBS storage instead of sharing the network path with other traffic. It's important to keep in mind, however, that while this dedicated path means the instance can function unhampered by other network usage, the choice of instance or disk may still limit performance. For instance, your instance's EBS bandwidth may allow up to 8,000 IOPS, but if your disk caps IOPS at 6,000, then that is the maximum you will get.
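
In other words, the effective ceiling is simply the lower of the two limits. A quick sanity check using the numbers above:

```python
# The effective ceiling is the lower of the instance's EBS limit and the volume's limit.
def effective_iops(instance_ebs_iops_limit: int, volume_iops_limit: int) -> int:
    return min(instance_ebs_iops_limit, volume_iops_limit)

print(effective_iops(8000, 6000))  # -> 6000; the disk, not the network, is the bottleneck
```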

Designing for Data Storage in AWS

Once you identify the type of AWS storage best suited to your business needs, you should bear some things in mind about designing your data storage in the cloud.

• Look for predictable performance levels, something that can be recreated, and not simply performance that occurs due to bursts. That way, you can plan for future performance with a degree of accuracy.

• Consider the difference between AWS-managed storage and customer-managed storage to determine which works for you. AWS-managed storage saves on cost but is also restricted to changes only when they are made at the programmatic level by Amazon, unlike customer-managed storage which may be tuned as and when your business needs change.

• Understand the concept of ‘burst credits’. As the size of a volume goes up, its baseline performance rises and it relies less on burst credits (see the short sketch after this list). Choosing two small volumes to meet the medium-sized needs of your business does not work; the performance of two small volumes does not match that of one large volume.

• Last of all, don’t base your expectations on burst throughput. Burst levels are sustained for short durations and don’t offer consistent performance.
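
To make the burst-credit point concrete, here is a rough sketch of the published gp2 arithmetic (roughly 3 IOPS per GiB with a floor of 100, bursting to about 3,000 IOPS for smaller volumes). Treat the exact numbers as an approximation that AWS may revise.

```python
# Rough gp2 arithmetic: baseline scales with size, while small volumes only look
# fast temporarily because they can burst until their credits run out.
def gp2_baseline_iops(size_gib: int) -> int:
    return max(100, min(3 * size_gib, 16000))

for size in (100, 334, 1000):
    base = gp2_baseline_iops(size)
    print(f"{size:>5} GiB: sustained {base} IOPS, burst up to {max(base, 3000)} IOPS")
# A 100 GiB volume bursts to 3,000 IOPS but sustains only 300; a 1,000 GiB volume
# sustains 3,000 IOPS without relying on burst credits at all.
```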

Buurst SoftNAS can assist with choosing the right storage for your business, and managing your file systems in the cloud. Get a free consult to learn more about the right way forward for your business. Learn about the free QuickStart program.

Best Practices Learned from 2,000 AWS VPC Configurations

Best Practices Learned from 2,000 Amazon AWS VPC Configurations. Download the full slide deck on Slideshare

The SoftNAS engineering and support teams have configured over 2,000 Amazon Virtual Private Cloud (VPC) configurations for SMBs to Fortune 500 companies. In this guide, we share the lessons learned in configuring AWS VPCs, including undocumented guides, tips, and tricks.

Amazon’s Virtual Private Cloud enables you to launch Amazon Web Services (AWS) resources, like EC2 instances, into a virtual network that you’ve defined. They are flexible, secure and a core building block of AWS deployments.

 

In this Guide, we covered:

  • How do packets really flow in an Amazon VPC?
  • Common security group misconfigurations.
  • Why endpoints are good things.
  • To NAT or not?
  • VPNs and VPCs: a good thing?
  • Best practices for AWS VPC management

We’ve configured over 2,000 Amazon AWS VPCs

In this post, we’ll be talking about some of the lessons we’ve learned from configuring over 2,000 VPCs for our customers on Amazon Web Services (AWS). Some of the customers we’ve configured VPCs for are listed out here.

We’ve got a wide range of experience in both the SMB and the Fortune 500 markets. Companies like Nike, Boeing, Autodesk, ProQuest, and Raytheon have all had their VPCs configured by SoftNAS.

Just to give you a brief overview of what we mean by SoftNAS: SoftNAS is the product that we use to help manage storage on AWS. You can think of it as a software-defined NAS. Instead of having a physical NAS as you do in a traditional data center, our NAS is software-defined and runs fully in the cloud with no hardware required. It’s easy to use.

You can get up and running in under 30 minutes, and it works with some of the most popular cloud computing platforms: Amazon, VMware, Azure, and CenturyLink Cloud.

What is an AWS VPC or a Virtual Private Cloud?

It can be broken down in a couple of different ways, and we’re going to break it down the way Amazon Web Services looks at it.

It’s a virtual network that’s dedicated to you. It’s essentially isolated from other environments in the AWS Cloud. Think of it as your own little mini data center inside of the AWS data center.

It’s a location where you launch resources into, and it allows you to logically group them together for control. It gives you configuration flexibility: use your own private IP addresses, create different subnets and routing, decide whether you want to allow VPN access in and how you want to do internet access out, and configure different security settings from a security group and access control list point of view.

The main things to look at, as I see it, are around control:

  • What is your IP address range? How is the routing going to work?
  • Are you going to allow VPN access?
  • Is it going to be a hardware device at the other end?
  • Are you going to use Direct Connect? How are you going to architect your subnets?

These are all questions. I’m going to cover some of the tips and tricks that I have learned throughout the years. Hopefully, these will be things that help everyone, because there is not really a great AWS VPC book or good guidance out there. It’s just a smattering of different tidbits and tricks from different people.

VPCs give you security groups and ACLs, as well as some specific routing rules. There are some features that are available only in VPCs. You can configure multiple NIC interfaces.

You can set static private IPs so that you don’t ever lose that private IP when the machine is stopped and started, and certain instance types, such as the T2s and the M4s, can only be launched within a VPC.

This is the way you could perform your hybrid cloud setup or configuration. You could use Direct Connect, for example, to securely extend your on-premises location into the AWS Cloud, or you could use a VPN connection over the internet to extend your on-premises location into the cloud.

You can peer the different VPCs together. You can actually use multiple different VPCs and peer them together for different organizational needs. You can also peer them together with other organizations for access to certain things — think of a backend supplier potentially for inventory control data.

Then there are VPC flow logs that help you with troubleshooting. For those of you with a Linux or networking background, think of it like tcpdump or Wireshark: the ability to look at the packets and how they flow can be very useful when you’re trying to troubleshoot.

Amazon AWS VPC Topology Guidance 

Just some AWS VPC topology guidance, so hopefully you’ll come away with something useful here. A VPC lives in a single region but can span multiple availability zones.

It will extend across at least two zones because you’re going to have multiple subnets. Each subnet lives in a single availability zone. If you configure multiple subnets, you can configure this across multiple zones. You can take the default or you can dedicate a specific subnet to a specific zone.

All the local subnets can reach each other and route to each other by default. The subnet sizes can be from a /16 to a /28 and you can choose whatever your IP prefix is.

How can you access traffic within the AWS VPC environment?

How can you access traffic within the virtual private cloud environment? There are multiple gateways, and there are differences between them. What do these gateways mean and what do they do? You hear acronyms like IGW, VGW, and CGW: what does all this stuff do?

These gateways generally are provisioned at the time of VPC creation, so keep that also in mind. The internet gateway is an ingress and egress for internet access.

Essentially, within your VPC you can point specific machines or routes out over the internet gateway to access resources outside of the VPC, or you can restrict that and not allow it to happen. That’s all based on your organizational policy.

A virtual private gateway (VGW) is the AWS side of a VPN connection. If you’re going to have VPN access to your VPC, this is the gateway on the AWS side of that connection, and the customer gateway (CGW) is the customer side of the VPN connection for a specific VPC.

For VPN options, you have multiple choices. I mentioned Direct Connect, which essentially gives you dedicated bandwidth to your VPC. If you wanted to extend your on-premises location up into the cloud, you could leverage Direct Connect for your high-bandwidth, lower-latency connections. Or, if you just wanted to make a connection quickly and didn’t necessarily need that level of throughput or performance, you could simply stand up a VPN channel.

Most VPN vendors like Cisco and others are supported and you can easily download a template configuration file for those major vendors directly.

Amazon Web Services (AWS) VPC Packet Flow

Let’s talk a little bit about how the packets flow within an AWS VPC. This is one of the things that I really wish I had known earlier on when I was first delving into configuring SoftNAS instances inside of VPCs and setting up VPCs for customers in their environments.

There is not really great documentation out there on how packets get from point A to point B under specific circumstances. We’re going to come back to this a couple of times, but keep in mind that we’ve got three instances here (instance A, B, and C) installed on three different subnets, as you can see across the board.

How do these instances communicate with each other?

Let’s look at how instance A communicates to instance B. The packets hit the routing table. They hit the node table. They go outbound to the outbound firewall.

The source and destination check occurs, then the outbound security group is checked. Then, on the receiving side, the inbound security group and the source and destination check in the firewall are evaluated.

This gives you an idea, if you make configuration changes in different areas, of where they actually have an impact and where they come into play. Now let’s talk about how instance B would talk to instance C.

Go back to the first diagram. We’ve already shown how A would communicate with B. How do we get over here to this other network? What does that actually look like from a packet flow perspective?

This is how it looks from an instance B perspective to try to talk to instance C, where it’s actually sitting on two subnets and the third instance (instance C) is on a completely different subnet.

It shows how the packets would flow out to a completely different network, and this depends on which subnet each instance is configured in.

Amazon AWS VPC Configuration Guide

Here are some of the lessons that we’ve learned over time. These are personal lessons I have learned, the things I wish somebody had handed me on a piece of paper on day one: what I would want to have known going into setting up different VPCs, and some of the mistakes I’ve made along the way.

Organize AWS Environment

Number one is to tag all of your resources within AWS. If you’re not doing it today, go do it. It may seem trivial, but when you start to get into multiple machines, multiple subnets, and multiple VPCs, having everything tagged so that you can see it all in one shot really helps not make big mistakes even bigger.

Plan your CIDR block very carefully. Once you set this VPC up, you can’t make it any bigger or smaller. That’s it, you’re stuck with it. Go a little bit bigger than you think you may need, because everybody who undersizes their VPC ends up wishing they hadn’t. Remember that AWS reserves five IPs per subnet. They just take them away for their own use; you don’t get them. Avoid overlapping CIDR blocks. It makes things difficult.

Save some room for future expansion, and remember, you can’t ever add any more. There are no more IPs once you set up the overall CIDR.
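
A quick back-of-the-envelope helper for sizing, using the five reserved addresses mentioned above:

```python
# AWS reserves 5 addresses per subnet (network, VPC router, DNS, future use,
# broadcast), so usable addresses = 2^(32 - prefix length) - 5.
def usable_ips(prefix_len: int) -> int:
    return 2 ** (32 - prefix_len) - 5

for prefix in (28, 24, 20):
    print(f"/{prefix}: {usable_ips(prefix)} usable addresses")
# /28 -> 11, /24 -> 251, /20 -> 4091
```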

AWS Subnet Your Way to Success

Control the network properly. What I mean by that is use your ACLs and use your security groups. Don’t be lazy with them. Control all those resources properly. There’s a lot of capability and flexibility right there within the ACLs and the security groups to really lock down your environment.

Understand what your AWS subnet strategy is.

Is it going to be smaller networks, or are you just going to hand out Class Cs to everyone? How is that going to work?

If your AWS subnets aren’t associated with a very specific routing table, know that they are associated with the main routing table by default, and only one routing table is the main one. I can’t tell you how many times I thought I had configured a route properly but hadn’t actually assigned the subnet to the routing table, and had put the entry into the wrong routing table. Just something to keep in mind; these are some of the little things that they don’t tell you.
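
If it helps, this is roughly what making that association explicit looks like with boto3; the route table and subnet IDs are placeholders.

```python
# Illustration only: explicitly associate the subnet with the route table you
# intend it to use, instead of silently inheriting the main route table.
import boto3

ec2 = boto3.client("ec2")

ec2.associate_route_table(
    RouteTableId="rtb-0123456789abcdef0",   # the table that actually holds your routes
    SubnetId="subnet-0123456789abcdef0",    # placeholder subnet ID
)
```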

I’ve seen a lot of people configure things by aligning their subnets to different tiers. They have the DMZ tier, the proxy tier, and so on. There are subnets for load balancing, subnets for applications, and subnets for databases. If you’re going to use RDS instances, you’re going to have to have at least three subnets, so keep that in mind.

Set your subnet permissions to “private by default” for everything. Use Elastic Load Balancers for filtering and monitoring frontend traffic. Use NAT to gain access to public networks. If you decide that you need to expand, remember the ability to peer your VPCs together.

Endpoint configuration

Also, Amazon has endpoints available for services that exist within AWS, such as S3. I highly recommend that you leverage the endpoint capability within these VPCs, not only from a security perspective but from a performance perspective.

Understand that if you try to access S3 from inside of the VPC without an endpoint configured, it actually goes out to the internet before it comes back in so the traffic actually leaves the VPC. These endpoints allow you to actually go through the backend and not have to go back out to the internet to leverage the services that Amazon is actually offering.
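
A hedged sketch of creating such a gateway endpoint for S3 with boto3; the VPC ID, the region in the service name, and the route table ID are placeholders.

```python
# Illustration only: a gateway endpoint keeps S3 traffic inside the VPC instead
# of sending it out to the internet and back. All IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",
    VpcEndpointType="Gateway",
    ServiceName="com.amazonaws.us-east-1.s3",   # match your region
    RouteTableIds=["rtb-0123456789abcdef0"],    # routes to S3 get added here
)
```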

Control Your Access

Do not set your default route to the internet gateway. That means everybody is going to be able to get out. In some of the default wizard settings that Amazon offers, this is the configuration, so keep it in mind: everyone would have access to the internet.

Do use redundant NAT instances if you’re going to go with instance mode; there are some CloudFormation templates that make this really easy to deploy.

Always use IAM roles. They’re so much better than access keys and so much better for access control, and they’re very flexible. Just in the last ten days, you can now attach an IAM role to a running instance, which is fantastic and makes roles even easier to leverage, since you no longer have to deploy new compute instances to attach and set IAM roles.
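
For reference, attaching a role to an already-running instance looks roughly like this with boto3; the instance profile name and instance ID are hypothetical.

```python
# Sketch only: attach an IAM role (via its instance profile) to a running instance.
import boto3

ec2 = boto3.client("ec2")

ec2.associate_iam_instance_profile(
    IamInstanceProfile={"Name": "softnas-s3-access"},   # hypothetical profile name
    InstanceId="i-0123456789abcdef0",                   # placeholder instance ID
)
```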

How does SoftNAS use Amazon VPC?

How does SoftNAS actually fit into using AWS VPC and why is this important?

We have a high-availability architecture leveraging our SNAP HA feature, which provides failover across zones for multi-AZ high availability. We leverage our own secure block replication using SnapReplicate to keep the nodes in sync, and we can provide a no-downtime guarantee within Amazon if you deploy SoftNAS in a multi-AZ configuration in accordance with our best practices.

Cross-Zone HA: AWS Elastic IP

This is how it looks, and we actually offer two modes of high availability within AWS. The first is the Elastic IP-based mode, where essentially two SoftNAS controllers are deployed in a single region, each of them in a separate zone.

They would be deployed in the public subnet of your VPC and they would be given elastic IP addresses and one of these elastic IPs would act as the VIP or the virtual IP to access both controllers. This would be particularly useful if you have on-premises resources, for example, or resources outside of the VPC that need to access this storage, but this is not the most commonly deployed use case.

Cross-Zone HA: Private Virtual IP Address

Our private virtual IP address configuration is really the most common way that customers deploy the product today; at this point, probably 85 to 90-plus percent of our deployments use this cross-zone private approach, where you deploy the SoftNAS instance in the private subnet of your VPC.

It’s not sitting in the public subnet, and you pick any IP address that exists outside of the CIDR block of the VPC to serve as the high-availability virtual IP. Then you just point your NFS mounts or map your CIFS shares to that private virtual IP that exists outside of the CIDR block for the overall VPC.

SoftNAS AWS VPC Common Mistakes

Some common mistakes that we see when people have attempted to deploy SoftNAS in a high availability architecture in VPC mode. You need to deploy two ENIs or Elastic Network Interfaces on each of the SoftNAS instances.

If you don’t catch this right away when you deploy, the ENIs can of course be added to the instance after it’s already deployed, but it’s much easier just to go ahead and deploy the instances with both network interfaces attached.

Both of these NICs need to be in the same subnet; if you deploy an ENI, make sure that both of them are in the same subnet. We also require ICMP to be open between the two zones as part of our health check.

The other problem we see is that people are not providing access to S3. As part of our HA, we provide a third-party witness, and that third-party witness is an S3 bucket. We therefore require access to S3, which means either an endpoint or access out of your data infrastructure.

For private HA mode, the virtual IP must not be within the CIDR of the VPC, in order to overcome some of the networking limitations that exist within Amazon. Taran, I’m going to turn it back over to you. That concludes my portion of the presentation.

We suggest you have a look at our “AWS VPC Best Practices” blog post. In it, we share a detailed look at best practices for configuring an AWS VPC and common VPC configuration errors.

SoftNAS Overview

Just to give everyone a brief overview of SoftNAS Cloud: basically, we are a Linux virtual appliance that’s available on the AWS Marketplace. You can go to SoftNAS on AWS, spin up an instance, and get up and running in about 30 minutes. As you can see in the image, our architecture is based on ZFS on Linux. We have an HTML5 GUI that’s very accessible and easy to use. We work on a number of cloud platforms including AWS, with support for storage backends such as Amazon S3 and Amazon EBS.

AWS NAS Storage Solution

SoftNAS offers AWS customers an enterprise-ready NAS capable of managing your fast-growing data storage challenges, including AWS Outposts availability. Dedicated features from SoftNAS deliver significant cost savings, high availability, lift-and-shift data migration, and a variety of security protections.

SoftNAS AWS NAS Storage Solution is designed to support a variety of market verticals, use cases, and workload types. Increasingly, SoftNAS is deployed on the AWS platform to enable block and file storage services through the Common Internet File System (CIFS), Network File System (NFS), Apple Filing Protocol (AFP), and Internet Small Computer System Interface (iSCSI). Watch the SoftNAS Demo.

How To Reduce Public Cloud Storage Costs

How To Reduce Public Cloud Storage Costs. Download the full slide deck on Slideshare

John Bedrick, Sr. Director of Product Marketing Management and Solution Marketing, discussed how SoftNAS Cloud NAS is helping to reduce public cloud storage costs. In this post, you will get a better understanding of data growth trends and what needs to be considered when looking to move into the public cloud.

The amount of data being created by businesses is staggering; it’s on the order of doubling every 18 months. This is an unsustainable long-term issue when we compare it to how slowly IT budgets are growing.

IT budgets on average are growing maybe about 2 to 3% annually. Obviously, according to IDC, by 2020 which is not that far off, 80% of all corporate data growth is going to be unstructured — that’s your emails, PDF, Word Documents, images, etc. — while only about 10% is going to come in the form of structured data like databases, for example. And that could be SQL databases, NoSQL, XML, JSON, etc. Meanwhile, by 2020, we’re going to be reaching 163 Zettabytes worth of data at a pretty rapid rate.

If you compound that with some brand-new sources of data that we hadn’t really dealt with much in the past, it’s really going to be challenging for businesses to try to control and manage, when you add in things like the Internet of Things and big data analytics, all of which create gaps between where the data is produced and where it’s going to be consumed, analyzed, and backed up.

Really, if you look at things even from a consumer standpoint, almost everything we buy these days generates data that needs to be stored, controlled, and analyzed – from your smart home appliances, refrigerators, heating, and cooling, to the watch that you wear on your wrist, and other smart applications and devices.

If you look at 2020, the number of people that will actually be connected will reach an all-time high of four billion and that’s quite a bit. We’re going to have over 25 million apps. We are going to have over 25 billion embedded and intelligent systems, and we’re going to reach 50 trillion gigabytes of data – staggering.

In the meantime, data isn’t confined merely to traditional data centers anymore, so the gap between where it’s stored and where it’s consumed keeps growing, and the preference for data storage is no longer going to be your traditional data center.

Businesses are really going to be in need of a multi-cloud strategy for controlling and managing this growing amount of data.

If you look at it, 80% of IT organizations will be committed to hybrid architectures, according to IDC. In another study, the “Voice of the Enterprise” by the 451 Research Group, it was found that 60% of companies will actually be operating in a multi-cloud environment by the end of this year.

Data is created faster than the IT budgets grow

While data is being created faster than IT budgets are growing, you can see from the slide that there’s a huge gap, which leads to frustration within the IT organization.

Let’s transition to how we address and solve some of these huge data monsters that are gobbling up data as fast as it can be produced and creating a huge need for storage.

What do we look for in a public cloud solution to address this problem?

Well, some of these have been around for a little while.

Data storage compression.

Now, for those of you who haven’t been around the industry for very long, data storage compression basically removes the whitespace between and in data for more efficient storage.

If you compress the data that you’re storing, then you get a net savings in your storage space, and that, of course, immediately translates into cost savings. Now, how much you save depends on the types of data that you are storing.

Not all cloud solutions, by the way, include the ability to compress data. One example that comes to mind is a very well-promoted cloud platform vendor’s offering that doesn’t offer compression. Of course, I am speaking about Amazon’s Elastic File System, or EFS for short. EFS does not offer compression. That means you either need a third-party compression utility to compress your data prior to storing it in the cloud on EFS or solutions like it, which can lead to all sorts of potential issues down the road, or you need to store your data in an uncompressed format; and of course, if you do that, you’re paying unnecessarily more for that cloud storage.
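
If you want a rough feel for how compressible your own data is before deciding where to store it, a few lines of Python are enough. The sample data here is made up purely for illustration.

```python
# Gauge compressibility before paying to store data uncompressed. The ratio you
# get depends entirely on the kind of data you feed in.
import zlib

def compression_ratio(data: bytes) -> float:
    return len(data) / len(zlib.compress(data))

log_like = b"timestamp=2019-01-01 level=INFO msg=request served\n" * 10_000
print(f"log-like data compresses roughly {compression_ratio(log_like):.1f}x")
# Already-compressed formats (JPEG, MP4, zip) will come out near 1.0x, so the
# savings always depend on the type of data you store.
```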

Deduplication

Another technology is referred to as deduplication. What is deduplication? It is exactly what it sounds like: the elimination or reduction of data redundancies.

If you look at all of the gigabytes, terabytes, and petabytes of data that you might have, there is going to be some level of duplication. Sometimes it’s a matter of multiple people storing the exact same file on a system that gets backed up into the cloud. All of that is going to take up additional space.

If you’re able to deduplicate the data that you’re storing, you can achieve some significant storage-space savings, which translate into cost savings; and that, of course, is subject to the amount of repetitive data being stored. Just as I mentioned previously with compression, not all solutions in the cloud include the ability to deduplicate data. As in the previous example about Amazon’s EFS, EFS also does not include native deduplication.

Again, either you’re going to need a third-party dedupe utility prior to storing your data in EFS or some other similar solution, or you’re going to need to store all your data in an un-deduped format in the cloud. That means you’re, of course, going to be paying more money than you need to.
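
To see why duplicate data is such low-hanging fruit, here is a toy illustration of block-level deduplication: hash fixed-size chunks and store each unique chunk only once. Real systems use smarter, variable-size chunking, but the space-saving principle is the same.

```python
# Toy dedupe: count how many fixed-size chunks would actually need to be stored.
import hashlib

def dedupe_stats(data: bytes, chunk_size: int = 4096):
    unique = set()
    total = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique.add(hashlib.sha256(chunk).hexdigest())
        total += 1
    return total, len(unique)

sample = b"A" * 4096 * 100 + b"B" * 4096 * 10     # deliberately repetitive sample
written, stored = dedupe_stats(sample)
print(f"{written} chunks written, only {stored} unique chunks stored")
```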

Object Storage

Much more cost-effective

Let’s just take a look at an example of two different types of storage at a high level. What you’ll take away from this image, I hope, is that object storage is going to be much more cost-effective, especially in the cloud.

Just a word of note: all the prices displayed in this table come from the respective cloud platform vendors’ West Coast pricing. They offer different prices based on different locations and regions; in this table, I am using West Coast pricing. What you will see is that the higher-performing public cloud block storage is considerably more expensive than the lower-performing public cloud object storage.

In the example, you can see ratios of five, six, or seven to one where object storage is less expensive than block storage. In the past, what people would typically use that object storage for was less active data, more of a long-tail strategy. You can think of it as akin to the more legacy-type drives that are still being used today.

Of course, what people would do is put their more active data in block storage. If you follow that, and you’re able to make use of object storage in a way that’s easy for your applications and your users to access, then that works out great.

If you can’t… Most solutions out in the market today are unable to access cloud-native object storage directly, so they need something in between to get the benefit of it. Similarly, getting cloud-native access to block storage also requires some solution, and there are a few out in the market; SoftNAS, of course, is one of those.

High Availability With Single Storage Pool

Relies on The Robust Nature of Cloud Object Storage

If you’re able to make use of object storage, what are some of the cool things you can do to save more money beyond using object storage by itself? A lot of applications require high availability. High availability (HA) is exactly what it sounds like: maintaining the maximum amount of uptime for applications and access to data.

In the past, on legacy on-premises storage systems, there has been the ability to have two compute instances access a single storage pool (both sharing access to the same pool), but it hadn’t been fully brought over into the public cloud until recently.

If you’re able to do this as the diagram shows, having two compute instances access an object-storage pool, that means you’re relying on the robust nature of public cloud object storage. The SLAs for public cloud object storage durability are typically ten or more 9s, that is, 99.99999999% or better, which is pretty good.

The reason you would have two compute instances is that the SLAs for compute are not the same as the SLAs for storage. Your compute instance can go down in the cloud just like it could on an on-premises system, but at least your storage would remain up, using object storage. If you have two compute instances running in the cloud and one of those (we’ll call it the primary node) were to fail, then it would fail over to the second compute instance, which I refer to in this diagram as the secondary node, and that node would pick up.

There would be some amount of delay switching from the primary to the secondary. That will be a gap if you are actively writing to the storage during that period of time, but then you would pick back up within a period of time (we’ll call it less than five minutes, for example), which is certainly better than being down for the complete duration until the public cloud vendor gets everything back up. Just remember that not every vendor offers this, but it can greatly reduce your overall public cloud storage cost. If you don’t need twice the storage for a fully highly available system and you can do it all with object storage and just two compute instances, you’re going to save roughly 50% of what the cost would normally be.

High-speed WAN optimization

Bulk data transfer acceleration

The next area of savings is one that a lot of people don’t necessarily think about when they are thinking about savings in the cloud and that is your network connection and how to optimize that high-speed connection to get your data moved from one point to another.

The traditional way is filling lots of physical hard drives or storage systems, putting them on a truck, and having that truck drive over to your cloud provider of choice; then taking those storage devices and physically transferring the data from them into the cloud, or possibly mounting them. That can be very expensive and filled with lots of hidden costs. Plus, you really do have to recognize that you run the risk of your data getting out of sync between the originating source in your data center and the ultimate cloud destination, all of which can cost you money.

Another option, of course, is to lease high-speed network connections between your data center or data source and the cloud provider of your choice. That also can be very expensive; a 1G or 10G network connection is pricey. Having the data transfer take longer than it needs to means that you have to keep paying for those leased lines, those high-speed network connections, longer than you would normally want.

The last option, transferring your data over slower, noisier, more error-prone network connections, especially in some parts of the world, is going to take longer due to the quality of your network connection and the inherent nature of the TCP/IP protocol. If it needs to retransmit data, sometimes because of those error conditions, drops, noise, or latency, the process is going to become unreliable.

Sometimes the whole process of data transfer has to start over right from the beginning so all of the previous time is lost and you start from the beginning. All of that can result in a lot of time-consuming effort which is going to wind up costing your business money. All of those factors should also be considered.

Automated Storage Tiering of Aged Data

Match Application Data to Most Cost-effective Cloud Storage

The next option I’m going to talk about is an interesting one. It assumes that you can make use of both object storage and block storage together, creating tiers of storage where you use high-speed, higher-performing block storage on one tier and other, less performant and less expensive storage on the remaining tiers.

If you can have multiple tiers, where your most active data is contained only within the most expensive, higher-performing tier, then you are able to save money by moving data from tier to tier. A lot of solutions out in the market today do this via a manual process, meaning that a person, typically somebody in IT, looks at the age of the data and moves it from one storage type to another, and then another.

If you have the ability to create aging policies that can move the data from one tier to another tier, to another tier, and back again, as it’s being requested, that can also save you considerable money in two ways.

One way is, of course, you’re only storing and using the data on the tier of storage that is appropriate at the given time, so you’re saving money on your cloud storage. Also, if it could be automated, you’re saving money on the labor that would have to manually move the data from one tier to another tier. It can all be policy-driven so you’ll save money on the labor for that.

These are all areas in which you should consider looking at to help reduce your overall public cloud storage expense.

What SoftNAS can offer to help you save money in the public cloud

Save 30-80% by reducing the amount of data to store  

SoftNAS provides enterprise-level cloud NAS Storage featuring data performance, security, high availability (HA), and support for the most extensive set of storage protocols in the industry: NFS, CIFS/SMB-AD, iSCSI. It provides unified storage designed and optimized for high performance, higher than normal I/O per second (IOPS), and data reliability and recoverability. It also increases storage efficiency through thin-provisioning, compression, and deduplication.

SoftNAS runs as a virtual machine, providing a broad range of software-defined capabilities, including data performance, cost management, availability, control, backup, and security.

Webinar: Consolidate Your Files in the Cloud

Consolidate Your Files in the Cloud. You can download the full slide deck on Slideshare

Consolidating your file servers in AWS or Azure cloud can be a difficult and complicated task, but the rewards can outweigh the hassle.

Consolidate Your Files in the Azure or AWS Cloud

In this blog, we will give an overview of file server and storage consolidation. What we want to talk about today is that everyone is on a cloud journey, and where you are on that journey will vary from client to client, depending on the organization and maybe even on which parts of the infrastructure or which applications are moving over.

The migration to the cloud is here and upon us. You’re not alone out there with it. We talk to many customers. SoftNAS has been around since 2012. We were born in the cloud; that’s where we cut our teeth and built our expertise. We’ve done approximately 3,000 AWS VPC configurations. There’s not a whole lot in the cloud that we haven’t seen, but at the same time, I am sure we will see more, and we do, every day.

What you’re going to find out there right now as we talk to companies is that storage is always increasing. I was at a customer site about a month ago, a major health care company. Their storage grows 25% every year, so it doubles every refresh cycle.

This whole idea of capacity growth is with us. Analysts say 94% of workloads will be in the cloud, and 80% of enterprise information, in the form of unstructured data, will be there.

A lot of the data analytics and IoT that you’re seeing now are all being structured in the cloud.

Four or five years ago, people were talking about cloud migration and about reducing CAPEX. Now, it’s become much more strategic. The first thing we work through with customers is one question:

 

“Where do we start?”

If I am an enterprise organization and I’m looking to move to the cloud, where do I start? Or maybe you’re already there, but looking to solve other issues.

The first use case we want to talk about, as you saw in the title, is file server and storage consolidation. What we mean by that is that enterprise organizations have file servers and storage arrays that, as I like to put it, are rusting out.

What I mean by that falls into roughly three situations.

One, you could be coming up on a refresh cycle because your company refreshes on a normal schedule, whether due to a lease or just due to the overall policy and budget.

Two, you could be coming up on the end of the initial three-year maintenance that you bought with that file server or storage array and getting ready for that fourth year, which, if you’ve been around this game long enough, you know the fourth and subsequent years are always hefty renewals.

Three, you may be getting to a stage where end of service or end-of-life (EOL) is happening with that particular hardware.

What we want to talk about here today, and show you, is how you lift that data into the cloud and how we move that data to the cloud so you can start using it there.

That way, when you go to this refresh, there is no need to go refresh those particular file servers and/or storage arrays, especially where you’ve got secondary and tertiary data where it’s mandatory to keep it around.

I’ve talked to clients where they’ve got so much data it would cost more to figure out what data they have, versus just keeping it. If you are in that situation out there, we can definitely help you with this.

The ability to move that now, to take that file server or storage array, take that data, move it to the cloud, and use SoftNAS to access that data, is what we are here to talk to you about today, along with how we can solve this.

This also applies in DR situations, and even in overall data center situations, anytime you’re looking to make a start in the cloud or you’ve got old gear sitting around.

You’re looking at the refresh and trying to figure out what do I do with it?

Definitely give us a ring here with that too.

As we talk about making the case for the cloud here, the secondary and tertiary data is probably one of the biggest problems that these customers deal with because it gets bigger and bigger.

It’s expensive to keep on-premises, and you’ve got to migrate it. Anytime you have a refresh or you buy new gear, you’ve got to migrate this data. No matter what toolsets you have, you’ve got to migrate every time you go through a refresh.

Why not just migrate once, get it done with, and simply add more as needed over time?

Now, the cloud is much safer, easier to manage, and much more secure.

A lot of the knowledge gaps we’ve had in the past about what’s going on in the cloud around security have all been taken care of.

SoftNAS: the only solution that makes the cloud perform as well as on-premises

What you’ll find with SoftNAS, and what makes us very unique, is that our customers continue to tell us, “By using your product, I’m able to run at the same speed or see the same customer experience in the cloud as I do on-prem.” A lot of that is because we’re able to tune our product. A lot of it is because of the way we designed our product, and more importantly, the smart people behind it who can help you make sure that your experience in the cloud is going to be the same as you have on-prem.

Or if you’re looking for something less than that, we can help with that too. What you’re going to see is what we offer around migration and movement of data, speed, performance, and high availability, which is a must, especially if you’re running any type of critical application or this data has to be highly available.

You’re also going to see that we’re scalable; we can scale all the way up to 16 PB. We’ve got compliance covered, which anyone responsible for your security will be happy to hear, because we take care of security and encryption of the data as well, to make sure it all works in a seamless fashion for you.

Virtual NAS Appliance

Tune to your performance and cost requirements

SoftNAS is a virtual NAS appliance, whether it’s running in your cloud platform or sitting on-prem in your VMware environment. We are storage agnostic, so we don’t care where that storage comes from: if you’re in Azure, anything from cool blob all the way to premium disk; if you’re on-prem, whatever SSDs or hard drives you have connected to your VMware environment.

As much as we are storage agnostic on the backend, on the front end, we’re also application agnostic. We don’t care what that application is. If that application is speaking general protocols such as CIFS, NFS, AFP, or iSCSI, we are going to allow you to address that storage.

And that gives you the benefit of being able to utilize backend storage without having to make API calls to that backend storage. It helps customers move to the cloud without having to rewrite their applications to be cloud-specific, whether that’s AWS or Azure. SoftNAS NAS Filer gives you the ability to access backend storage without needing to talk to the APIs associated with that backend storage.

Tiered Storage Across Multi-cloud Storage Types

There are benefits that come from utilizing SoftNAS as a frontend to your storage. Since I’ve been at this company, our customers have been asking us for one thing in particular: can we get a tiering structure across multiple storage types?

This is regardless of the backend storage: I want my data to move as needed. Across the many environments we see, probably about 10% to 20% of the data is considered hot data, with 80% to 90% of it being cool if not cold data.

Customers knowing their data lifecycle allows them to save money on their deployment. We have customers that come in and say, “My data lifecycle is: for the first 30 to 60 days, it’s heavily hit. In the next 60 to 90 days, somebody may or may not touch it. Then after 90 to 120 days, it needs to move to some type of archive tier.” SoftNAS gives you the ability to do that by setting up smart tiers within the environment. Based on the aging policy associated with the blocks of data, it migrates that data down by tier as need be.

So, after the first 30 to 60 days, the aging policy moves the data down to tier two. If it still hasn’t been touched after 90 to 120 days, it moves down to an archive tier, giving you the cost savings associated with archive or lower-cost tier-two storage.
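
To make the aging-policy idea concrete, here is a minimal sketch of age-based tier selection. This is only an illustration of the concept, not SoftNAS’s implementation; the tier names and thresholds below are hypothetical examples loosely following the 30/60- and 90/120-day lifecycle described above.

```python
from datetime import datetime, timedelta

# Hypothetical tier names and aging thresholds; real smart-tier policies
# are configured inside SoftNAS, not in application code like this.
TIER_POLICY = [
    ("tier1_hot", timedelta(days=0)),       # recently touched data stays hot
    ("tier2_warm", timedelta(days=60)),     # untouched for ~60 days moves down
    ("tier3_archive", timedelta(days=120)), # untouched for ~120 days is archived
]

def pick_tier(last_access, now=None):
    """Return the tier a block should live in, given its last-access time."""
    now = now or datetime.utcnow()
    age = now - last_access
    chosen = TIER_POLICY[0][0]
    for tier, threshold in TIER_POLICY:
        if age >= threshold:
            chosen = tier  # keep moving down while the age exceeds the threshold
    return chosen

# A block untouched for 75 days lands in the warm tier; touching it again
# resets last_access, which promotes it back toward the hot tier.
print(pick_tier(datetime.utcnow() - timedelta(days=75)))  # -> tier2_warm
```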

The other benefit is that data can migrate back up the tiers just as it migrates down. Say you’re going through a review, and the data in question hasn’t been touched for a year.

Whether it’s a tax review or some other type of audit, as that data is touched again it first migrates from tier three back up to tier two. If it keeps being touched, it moves all the way back up to tier one, and then the aging policy starts it back down the tiers again.

Tier three could be several things: object storage, EBS magnetic, or cold HDD. Tier two, depending on the platform you’re on, could be EBS Throughput Optimized or GP2.

Your hot tier could be GP2 or Provisioned IOPS SSD, or Premium or Standard disk, depending on what your performance needs are.

High Availability (HA) Architecture

SoftNAS has patented high availability (HA) on all the platforms we support: VMware, Azure, and AWS.


In that environment, a virtual IP sits in front of two SoftNAS instances, each with its own associated data stores.

A heartbeat runs between the two instances while the application talks to the virtual IP. If an issue or failure occurs, the primary instance shuts down and service moves to the secondary instance, which then takes over as the new primary.

The benefit of that is that it’s a seamless transition for any kind of outage that you might have within your environment.

It’s also structured according to provider best practice: place the instances in an availability set (Azure) or in different availability zones (AWS) so you can take advantage of the SLAs the provider offers.
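
To make the heartbeat-and-failover idea concrete, here is a minimal monitoring loop in the spirit of what was just described. It is a sketch only, not SoftNAS’s HA Monitor; the primary address, check interval, miss threshold, and failover hook are all assumptions.

```python
import subprocess
import time

PRIMARY = "10.0.1.10"   # hypothetical primary SoftNAS instance
CHECK_INTERVAL = 2      # seconds between heartbeats
MAX_MISSES = 5          # roughly 10 seconds of missed heartbeats triggers failover

def heartbeat_ok(host):
    """One ICMP heartbeat; returns True if the primary answered."""
    return subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0

def promote_secondary():
    """Placeholder for the takeover step: stop the old primary and
    re-point the virtual IP so clients keep using the same address."""
    print("failover: promoting secondary, re-pointing the virtual IP")

misses = 0
while True:
    misses = 0 if heartbeat_ok(PRIMARY) else misses + 1
    if misses >= MAX_MISSES:
        promote_secondary()
        break
    time.sleep(CHECK_INTERVAL)
```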

SoftNAS Disaster Recovery (DR) and High Availability (HA)


In this scenario, your HA setup lives in the California region, with SnapReplicate and HA configured between the two instances because that’s what’s needed within that region.

That lets you fail over if anything happens to an instance itself. In Azure, placing the pair in an availability set guarantees that the two instances never sit on the same hardware or underlying infrastructure, which gets you up to five 9s of availability.

Within AWS, you achieve the same thing by using availability zones, which also gives you up to five 9s of application availability. Until about a year ago, you could say that no provider had ever lost an availability zone or a region. But then, about a month apart, AWS had a region go down and Azure had a region go down, and a customer came to us asking for a solution.

The solution we gave them was DR to a region entirely outside that availability zone or region. That’s what the next picture shows: although you have SnapReplicate within the region to protect you, you also have DR replication to a location entirely outside it, so your data is protected even if a whole region fails.

Automated Live Data Migration with continuous sync


The other thing our customers ask for: when we come into their environments, they have tons of data, and the goal is to move it to the cloud either as quickly as possible or as seamlessly as possible.

They’ve asked us for multiple ways to get that data to the cloud. If your goal is to move the data as quickly as possible — and we’re talking about petabytes or hundreds of terabytes — your best approach at that point is the FedEx route.

Whether it’s Azure Data Box or AWS Snowball, shipping the data to the cloud and then importing it into SoftNAS makes it easier for you to manage.

That approach is a cold cut-over. It means that at some point, on-prem, you have to stop the traffic and say, “This is the cut-off. I’m going to ship this over to the cloud, populate it there, and then start fresh with my new data set running in the cloud.”

If a cold cut-over isn’t what you’re after, and we’re not talking about petabytes’ worth of data, the approach we walk customers through is Lift & Shift. Using SoftNAS and its Lift & Shift data migration capability, you can do a warm cut-over: data still live on your legacy storage servers is copied over to the SoftNAS instance in the cloud while it keeps running.

Once the copy is complete, you simply roll over to the new instance in the cloud. SoftNAS also has UltraFAST congestion algorithms that allow that data to travel efficiently over highly congested lines.

What we’ve seen within our testing and within different environments is that we could actually push data up to 20X faster by using UltraFAST across lines.

This is the decision you need to make: are you cold-cutting the data and sending it to the cloud via Azure Data Box or Snowball, or can you use Lift & Shift and do a warm cut-over to the SoftNAS instance you have in the cloud?
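
As a rough sketch of what a warm cut-over looks like operationally, the pattern is: keep syncing while the legacy system stays live, then freeze writes, do one final quick pass, and switch clients over. SoftNAS Lift & Shift automates this and adds the UltraFAST transport; the plain rsync loop below is only an illustration of the pattern, and the source path and target host are hypothetical.

```python
import subprocess

SOURCE = "/mnt/legacy-share/"                 # hypothetical on-prem export
TARGET = "cloudnas.example.com:/pool/share/"  # hypothetical cloud NAS target

def sync_pass():
    # Each pass copies only the changes, so passes get shorter as the data converges.
    subprocess.run(["rsync", "-a", "--delete", SOURCE, TARGET], check=True)

# Warm cut-over: repeat incremental passes while production keeps writing,
# then freeze application writes, run one last short pass, and re-point clients.
for _ in range(3):
    sync_pass()
# ... freeze application writes here ...
sync_pass()  # final catch-up pass; clients can now be pointed at the cloud NAS
```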

 

SoftNAS Lift and Shift Data Migration

SoftNAS’s cost-effective Lift and Shift data migration solution allows you to move data to the cloud for economies of scale and disaster recovery, all while reducing data storage expenses.

 

Lift & Shift Data Migration

It’s a key feature of SoftNAS Cloud NAS, enabling users to migrate data from one platform to another, whether from on-premises to the cloud or between different cloud providers, while maintaining continuous synchronization. SoftNAS® is a cloud data orchestration product focused on simplifying pain points in the marketplace.

As businesses continue to look for ways to increase efficiency and improve their bottom line, more and more look to the cloud. With increased flexibility, and the ability to cut out the high cost of hardware and hardware maintenance, the cloud is seen as the solution of the future, even though many are uncertain how to implement it.

Buurst aims to make navigating the cloud a great deal easier, so that organizations can adopt simplified business continuity strategies and reduce infrastructure, maintenance, and service costs without requiring advanced software and platform training.

Best Practices Learned from 1,000 AWS VPC Configurations


AWS VPC Best Practices with SoftNAS 

Buurst SoftNAS has been available on AWS for the past eight years providing cloud storage management for thousands of petabytes of data. Our experience in design, configuration, and operation on AWS provides customers with immediate benefits.  

Best Practice Topics  

  • Organize Your AWS Environment
  • Create AWS VPC Subnets
  • SoftNAS Cloud NAS and AWS VPCs
  • Common AWS VPC Mistakes
  • SoftNAS Cloud NAS Overview
  • AWS VPC FAQs

Organize Your AWS Environment

Organizing your AWS environment is a critical step in maximizing your SoftNAS capability. Buurst recommends the use of tags. The addition of new instances, routing tables, and subnets can create confusion. The simple use of tags will assist in identifying issues during troubleshooting.  

When planning your CIDR (Classless Inter-Domain Routing) block, Buurst recommends making it larger than you expect to need. AWS reserves five IP addresses in every subnet you create, so remember that each new subnet carries a five-IP overhead.

Additionally, avoid overlapping CIDR blocks, because VPC peering with another VPC will not work correctly between overlapping ranges. Finally, there is no cost associated with a larger CIDR block, so simplify your scaling plans by choosing a larger block size upfront.
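
A minimal boto3 sketch of those two recommendations, a generous /16 CIDR and a Name tag on everything you create; the region, CIDR, and tag values are just examples, not prescribed settings.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # example region

# Pick a larger, non-overlapping CIDR up front; a bigger block costs nothing.
vpc_id = ec2.create_vpc(CidrBlock="10.20.0.0/16")["Vpc"]["VpcId"]

# Tag the VPC (and later every subnet, route table, and instance) so
# troubleshooting doesn't turn into guessing which resource is which.
ec2.create_tags(
    Resources=[vpc_id],
    Tags=[{"Key": "Name", "Value": "softnas-prod-vpc"},
          {"Key": "Environment", "Value": "production"}],
)
```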

Create AWS VPC Subnets

A best practice for AWS subnets is to align VPC subnets with your application tiers: for example, a DMZ/proxy or ELB (load balancer) layer, an application layer, and a database layer. If a subnet is not associated with a specific route table, it falls back to the main route table by default. Missing subnet associations are a common cause of packets not flowing where you expect.

Buurst recommends putting everything in a private subnet by default and keeping only ELB, filtering, or monitoring services in your public subnet. Use a NAT for outbound access to the public network, ideally as part of a dual-NAT configuration for redundancy. CloudFormation templates are available to set up highly available NAT instances, which need to be sized for the amount of traffic passing through them.

Set up VPC peering to reach other VPCs in your environment or in a customer or partner environment. Buurst recommends using VPC endpoints for services like S3 instead of going out over a NAT instance or internet gateway to reach services that don’t live within the VPC. An endpoint is more efficient and lower latency than an external link.
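
A boto3 sketch of the subnet and endpoint recommendations above: create a private subnet, explicitly associate it with a route table (so nothing silently falls back to the main table), and add a gateway endpoint for S3 so that traffic never leaves the VPC. The IDs, availability zone, and service name are illustrative placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
vpc_id = "vpc-0123456789abcdef0"  # example VPC created earlier

# Private subnet for an application/storage tier.
subnet_id = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.20.1.0/24", AvailabilityZone="us-east-1a"
)["Subnet"]["SubnetId"]

# Explicit route table association; missing associations are a common
# reason packets end up on the main route table unexpectedly.
rt_id = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]["RouteTableId"]
ec2.associate_route_table(RouteTableId=rt_id, SubnetId=subnet_id)

# Gateway endpoint for S3, so S3 traffic skips the NAT and internet gateway.
ec2.create_vpc_endpoint(
    VpcId=vpc_id,
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=[rt_id],
)
```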

Control Your Access

Control access within the AWS VPC, and don’t cut corners by simply pointing a default route at the internet gateway. This setup is a common problem many customers end up working through with our Support organization. Again, we encourage redundant NAT instances, either using the CloudFormation templates available from Amazon or by building your own highly available NAT pair.

The default NAT instance size is an m1.small, which may or may not suit your needs depending on the traffic volume in your environment. Buurst highly recommends using IAM (Identity and Access Management) for access control, especially attaching IAM roles to instances. Remember that IAM roles cannot be assigned to already-running instances; they are set up at instance creation time. Using IAM roles means you don’t have to keep embedding AWS keys in individual products just to reach API services.
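
A short boto3 sketch of attaching an IAM role (via its instance profile) at launch so the instance picks up temporary credentials instead of baked-in AWS keys. The AMI, instance type, subnet, and profile name are placeholders, not recommended values.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launching with an instance profile means the instance gets temporary
# credentials from the role, so no access/secret keys are stored on it.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",                  # placeholder AMI
    InstanceType="m5.large",                          # size for your traffic
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",              # private subnet from earlier
    IamInstanceProfile={"Name": "softnas-s3-access"},  # hypothetical profile name
)
```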

How Does SoftNAS Fit Into AWS VPCs?

Buurst SoftNAS offers a highly available storage architecture, leveraging our SNAP HA capability to provide high availability across multiple availability zones. SNAP HA offers 99.999% availability, with two SoftNAS controllers replicating the data into block storage in both availability zones. Buurst customers who run in this configuration qualify for the Buurst No Downtime Guarantee.

Additionally, AWS provides no SLA unless your solution runs in a multi-zone deployment.

SoftNAS uses a private virtual IP address in which both SoftNAS instances live within a private subnet and are not accessible externally, unless configured with an external NAT, or AWS Direct Connect.

SoftNAS SNAP HA provides NFS, CIFS and iSCSI services via redundant storage controllers. One controller is active, while another is a standby controller. Block replication transmits only the changed data blocks from the source (primary) controller node to the target (secondary) controller. Data is maintained in a consistent state on both controllers using the ZFS copy-on-write filesystem, which ensures data integrity is maintained. In effect, this provides a near real-time backup of all production data (kept current within 1 to 2 minutes). 

A key component of SNAP HA™ is the HA Monitor. The HA Monitor runs on both nodes that are participating in SNAP HA™. On the secondary node, HA Monitor checks network connectivity, as well as the primary controller’s health and its ability to continue serving storage. Faults in network connectivity or storage services are detected within 10 seconds or less, and an automatic failover occurs, enabling the secondary controller to pick up and continue serving NAS storage requests, preventing any downtime.  

Once the failover process is triggered, either due to the HA Monitor (automatic failover) or as a result of a manual takeover action initiated by the admin user, NAS client requests for NFS, CIFS and iSCSI storage are quickly re-routed over the network to the secondary controller, which takes over as the new primary storage controller. Takeover on AWS can take up to 30 seconds, due to the time required for network routing configuration changes to take place. 
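
One way to picture the routing change during takeover: the virtual IP exists as a /32 host route in the VPC route table, and failover re-points that route at the secondary controller’s network interface. The boto3 call below illustrates the mechanism only; it is not SoftNAS’s internal code, and the route table ID, VIP, and ENI ID are made up.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Re-point the virtual IP's /32 host route from the failed primary's ENI
# to the secondary controller's ENI so NFS/CIFS/iSCSI clients follow it.
ec2.replace_route(
    RouteTableId="rtb-0123456789abcdef0",        # route table holding the VIP route
    DestinationCidrBlock="10.99.99.99/32",       # virtual IP (outside the VPC CIDR)
    NetworkInterfaceId="eni-0fedcba9876543210",  # secondary controller's ENI
)
```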

Common AWS VPC Mistakes

These are the most common support issues in AWS VPC configuration: 

  • Deployments require two network interfaces (NICs), with both NICs in the same subnet. Double-check this during configuration.
  • SoftNAS health checks ping between the two instances, so the security group must allow that traffic at all times.
  • The virtual IP address must not be within the CIDR of the AWS VPC. So, if the CIDR is 10.0.0.0/16, select a virtual IP address outside that range (a quick check is sketched below).
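
A quick way to sanity-check that last point before configuring HA, using only the Python standard library; the CIDR and candidate VIP are examples.

```python
import ipaddress

vpc_cidr = ipaddress.ip_network("10.0.0.0/16")       # example VPC CIDR
candidate_vip = ipaddress.ip_address("10.99.99.99")  # example virtual IP

# The virtual IP must NOT fall inside the VPC CIDR, otherwise normal VPC
# routing will claim it and the HA host route will never take effect.
if candidate_vip in vpc_cidr:
    raise ValueError(f"{candidate_vip} overlaps the VPC CIDR {vpc_cidr}; pick another VIP")
print(f"{candidate_vip} is outside the VPC CIDR and safe to use as the virtual IP")
```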

SoftNAS Overview

Buurst SoftNAS is an enterprise virtual software NAS available for AWS, Azure, and VMware with industry-leading performance and availability at an affordable cost. 

SoftNAS is purpose-built to support SaaS applications and other performance-intensive solutions requiring more than standard cloud storage offerings.

  • Performance – Tune performance for exceptional data usage 
  • High Availability – From 3-9’s to 5-9’s HA with our No Downtime Guarantee 
  • Data Migration – Built-in “Lift and Shift” file transfer from on-premises to the cloud 
  • Platform Independent – SoftNAS operates on AWS, Azure, and VMware

Learn: SoftNAS on AWS Design & Configuration Guide 

    AWS VPC FAQs

    Common questions related to SoftNAS and AWS VPC: 

    We use VLANs in our data centers for isolation purposes today. What VPC construct do you recommend to replace VLANs in AWS?

    That would be subnets. You can either use subnets for isolation or, if you want a stronger isolation mechanism, create another VPC to isolate those resources further and then connect the two via VPC peering.

    You said to use IAM for access control, so what do you see in terms of IAM best practices for AWS VPC security?

    The most significant thing is that, whether you’re dealing with third-party products or custom software on your web server, anything that needs AWS API resources requires a secret key and an access key. You can store that secret key and access key in some kind of text file and reference it, or, the easier way, set the minimum level of permissions you need in an IAM role, create the role, and attach it to your instance at start time. The role itself can only be assigned at start time, but the permissions of the role can be modified on the fly, so you can add or subtract permissions should the need arise.

    So when you’re troubleshooting complex VPC networks, what approaches and tools have you found to be the most effective?

    We love traceroute. I like to use ICMP when it’s available, but I also like AWS Flow Logs, which let me see what’s going on at a much more granular level, and tools like CloudTrail so I know what API calls were made by which user to understand what happened.
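
    If you want Flow Logs in place before you need them, a minimal boto3 sketch for turning them on at the VPC level looks like the following; the VPC ID, log group name, and IAM role ARN are placeholders you would substitute with your own.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Capture accepted and rejected traffic for the whole VPC into CloudWatch Logs.
ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],   # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",
    LogGroupName="softnas-vpc-flow-logs",    # placeholder log group
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",
)
```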

    What do you recommend for VPN intrusion detection?

    There are a lot of options available. We’ve got some experience with Cisco, Juniper, and Fortinet for VPN, or whatever vendor you have, and as far as IDS goes, Alert Logic is a popular solution; I see a lot of customers use that particular product. Some people also like open-source tools like Snort.

    Any recommendations around secure jump box configurations within an AWS VPC?

    If you’re going to deploy most of your resources within a private subnet and you’re not going to use a VPN, one common approach is to configure a simple jump box: take a server, Windows or Linux depending on your preference, put it in the public subnet, and only allow access from a limited set of IP addresses over SSH (Linux) or RDP (Windows). That puts you inside the network and lets you reach the resources in the private subnet.

    And do people combine the two? Are people using VPNs to access the jump box for added security?

    Some people do that. Sometimes they’ll put the jump box behind the VPN and you VPN in to reach it. It’s just a matter of your organization’s security policies.

    Any performance or other considerations when designing the VPC?

    It’s important to understand that each instance has its own finite amount of resources, for both network I/O and storage I/O. Take a 10-gigabit instance like the c3.8xlarge: that isn’t 10 Gbps of network bandwidth plus 10 Gbps of storage bandwidth, it’s 10 Gbps for the instance as a whole. So if you’re pushing a high amount of I/O from both a network and a storage perspective, that 10 Gbps is shared between client network traffic and access to the underlying EBS storage network. This confuses a lot of people: it’s 10 Gbps for the instance, not just a 10 Gbps network pipe.

    Why would you use an elastic IP instead of the virtual IP?

    You might have people who need to access the storage from outside of AWS. We do have customers whose servers are mostly within AWS but who also want access to files from clients that aren’t inside the AWS VPC. You can use an elastic IP that way, and honestly this was the first way we implemented HA, because at the time it was the only method that let us share an IP address and work around public cloud limitations such as the lack of layer 2 broadcast.

    Looks like this next question is around AWS VPC tagging. Any best practices for example?  

    Yeah, I see people take different services, like web, database, or application, and tag everything, including the security groups, with that particular tag. For people deploying SoftNAS, I’d recommend simply using the name SoftNAS as the tag. It’s really up to you, but I do suggest you use tags; they will make your life a lot easier.

    Is storage level encryption a feature of SoftNAS Cloud NAS or does the customer need to implement that on their own?  

    As of our currently available version, 3.3.3, on AWS you can leverage the underlying EBS encryption, and we provide encryption for Amazon S3 as well. Coming in our next release, due out at the end of the month, we also offer our own encryption, so you can create encrypted storage pools that encrypt the underlying disk devices.

    Virtual IP for HA: does the subnet the VIP would be part of get added to the AWS VPC routing table?

    It’s automatic. When you select that VIP address in the private subnet, a host route is automatically added to the routing table, which allows clients to route that traffic.

    Can you clarify the requirement on an HA pair with two NICs, that both have to be in the same subnet?

    So each instance needs two ENIs (elastic network interfaces), and each of those ENIs needs to be in the same subnet.

    Do you have HA capability across regions? What options are available if you need to replicate data across regions? Is the data encrypted at rest, in flight, etc.?

    We cannot do HA with automatic failover across regions. However, we can do SnapReplicate across regions, and you can then perform a manual failover should the need arise. Data transferred via SnapReplicate is sent over SSH. You can replicate across regions, across data centers, and even across different cloud platforms.

    Can AWS VPC peering span across regions?

    The answer is no, it cannot.

    Can we create an HA endpoint to AWS for use with AWS Direct Connect?

    Absolutely. You could go ahead and create an HA pair of SoftNAS Cloud NAS, leverage direct connect from your data center, and access that highly available storage.

    When using S3 as a backend and a write cache, is it possible to read the file while it’s still in cache?

    The answer is yes, it is. I’m assuming you’re asking about the eventual consistency challenges of the AWS standard region; because of the way we deal with S3, treating each bucket as its own hard drive, we don’t run into the S3 consistency challenges.

    Regarding subnets, in the example where a host lives in two subnets, can you clarify whether both subnets are in the same AZ?

    In the examples I’ve used, each subnet is in its own separate availability zone, so no, they are not in the same AZ. If you want to discuss this further, please feel free to reach out.

    Is there a white paper on the website dealing with the proper engineering for SoftNAS Cloud NAS for our storage pools, EBS vs. S3, etc.?

    Click here to access the white paper, which is our SoftNAS architectural paper which was co-written by SoftNAS and Amazon Web Services for proper configuration settings, options, etc. We also have a pre-sales architectural team that can help you out with best practices, configurations, and those types of things from an AWS perspective. Please contact sales@softnas.com and someone will be in touch.

    How do you solve the HA and failover problem?

    We do a couple of things here. When we set up HA, we create an S3 bucket that acts as a third-party witness; before anything takes over as the master controller, it queries that S3 bucket to confirm it is safe to take over. The other thing we do is shut down the old source node after a takeover. You don’t want a node that’s flapping up and down, sort of up but not really, repeatedly trying to take over, so whenever a takeover occurs, whether manual or automatic, the old source node in that configuration is shut down. That event is logged, and we assume you’ll go investigate why the failover took place. If there are questions about that in a production scenario, support@softnas.com is always available.