Three Technology Trends Helping to Revive the Oil and Gas Industry

My career has somehow always aligned with oil and gas technology, from my time at Sun Microsystems to Microsoft to AWS and now Buurst. I can say that 2020 has been one of the most painful years, starting with a price war between Saudi Arabia and Russia, followed by a global slowdown from Covid-19 and the reduced cost of competing energy sources. Ouch! So how do we start the recovery? Here are three important technology trends to help oil and gas bounce back.

Trend One: The Cloud

We all know drilling dry holes is no longer acceptable. Leveraging geospatial application technology like Halliburton Decision Space 365® (Landmark), Schlumberger Petrel®, or IHS Markit Kingdom® has become table stakes for ensuring you not only know the best place to drill, but the right place to drill.

Most energy companies have invested tens of millions of dollars into building world-class data centers dedicated to this work. These investments are essential and of clear strategic value. But the costs keep going up, and you need more and more IT and security people to manage your investment (it’s a dangerous world out there). Worst of all, you get to replace this investment every three years if you want to stay competitive. What do you do when you don’t have $20M to retool? Today there is an answer: move the workloads to the cloud.

Why the cloud? You get the fastest and most up-to-date processing power without the need to buy the infrastructure. Moving to the cloud lets you shift your investment from a capital expense to an operating expense that you pay for by the hour, all backed and secured by companies like Microsoft (Azure) and Amazon (AWS). Moving to the cloud is happening today, and it’s happening fast. We have seen our Energy business grow more quickly this year than over the past seven years. 2020 is the tipping point for the cloud.

Trend Two: Lift and Shift

Saving investment dollars, closing data centers, and simplifying your IT footprint are crucial goals of moving to the cloud. Companies often take two approaches. The first approach is to move specific workloads that are data- and processing-intensive; geospatial applications are great examples. The second approach is to focus on closing your data centers. Many companies have thousands of applications in their data centers, and the prospect of moving these can be daunting. Migrating an application or an entire data center is commonly referred to as lift and shift. The good news is that 80% of these apps will move with no or very few issues. So what do you do with the 20% of applications that are hard to move? If the goal is to close the data center, you can’t finish until all the applications are migrated, and if the data center is still open, you can’t achieve the cost savings and your IT infrastructure will be more complicated. Making the hard-to-move apps move is essential.

Many companies go down the path of rewriting these hard-to-move applications, but it’s unnecessary. Mostly, these apps don’t work in the cloud because they leverage protocols that are not supported in the cloud or because they are latency-sensitive. The most common protocol that is unsupported in the cloud is iSCSI. There are solutions here, and it’s essential to leverage them for these hard-to-move apps. Buurst SoftNAS is a great example.

Legacy SQL Server workloads often fall into this category, and the countless instances across an enterprise could take all your DevOps resources years to rework. Don’t let your DevOps resources work on legacy workloads; these expensive and vital resources should be building the applications of the future. Leverage cloud storage solutions that support the protocols you’re already using and move forward.

Trend Three: Cross-platform business partnerships

So this trend is a little controversial but essential. If you ask any cloud vendor about a multi-cloud strategy, they will always tell you to just pick one, and to make sure it’s them. There are some excellent reasons to choose one cloud, and it’s worth spending time looking before you make the decision. The trend we’re seeing is to pick solutions that work in different clouds, so that moving will not require reengineering your infrastructure.

A VMware hypervisor is a great example. You probably use it in your data center today, so moving it is a straightforward effort. Storage is often overlooked and becomes the “lock-in” element of choice for cloud vendors. Knowing what makes a cloud sticky is essential; if you know going in, you can avoid making costly mistakes. Fortunately, there are many great partners, like SoftwareONE, Kaskade.Cloud, CANCOM, VSTECS, or LANStatus to name a few, with cloud architects who can help you manage this part of your transition.

BP recently said that oil demand may never rebound to pre-pandemic highs as the world shifts to renewables. I don’t know if that’s true, but for now there is a clear need to rethink spending and redirect resources to take advantage of the lessons learned in other industries. No one ever wants to be first to take the plunge. Fortunately, companies like Halliburton, Schlumberger, Petronas, IHS Markit, and ExxonMobil have all moved and are leveraging these strategies. Come on in, the water’s fine.

Do IOPS really matter?

Since the beginning of the storage era, almost all storage vendors have challenged each other to achieve the highest number of IOPS possible. Plenty of historical storage datasheets show IOPS as the only number, and customers at the time probably just followed those numbers.

Do IOPS really matter?

The short answer here is: “a little bit.” IOPS is one factor among several. After the data revolution, a lot changed. Now the source of data could be millions of devices in an IoT system, which means millions of systems trying to read and write simultaneously. The type of workload varies dramatically, especially in the presence of caching, from write-intensive media solutions like VDI to read-intensive workloads in the database world. And the time it takes to reach the data has become extremely important in time-sensitive architectures like core banking.

So huge numbers measured in millions are nothing to be proud of on their own. Let us check what other factors we need to look at before selecting or judging our storage.

How are IOPS measured, and do they relate to your workload?

Storage vendors used to run their benchmarks in a way that helped them reach a higher number of IOPS: usually a small number of clients, which might not match your use case; a small block size such as 4K, which might be much smaller than the one you need; and random read/write workloads (often a 50/50 mix) where SSDs excel, which might not be relevant to, for example, VDI or archiving workloads. Reads are also usually much faster than writes, especially in RAID arrays. This type of benchmarking leads to a huge IOPS number that may not be relevant to workloads that need fewer IOPS but more data written per I/O, and it hides a game-changing factor: latency.

Latency does matter!

Latency is a truly critical factor; never accept those huge IOPS numbers without having a look at the latency figures.

Latency is how long it takes for a single I/O operation to complete. As the workload increases, the storage hardware (the controller, caching, RAM, CPU, etc.) will try to keep latency consistent, but things are not that ideal. At a certain, very high number of IOPS, things go out of control: the storage appliance hardware gets exhausted and increasingly busy, the application starts to notice delays in serving the data, and problems begin to happen.

Databases, for example, are very latency-sensitive workloads. They usually need low latency [5 ms or lower], especially during writes; otherwise there will be huge performance degradation and business impact.

So if your business is growing and you notice degradation in your database performance, you don’t only need storage with a higher IOPS rate but with lower latency as well. That leads us to a side point, storage flexibility, which Buurst can help you with: in just a few steps you can upgrade your storage to whatever numbers satisfy your workload.

[Chart: IOPS vs. latency]

Note: Although the storage supports up to 10M IOPS, it is almost unusable beyond 2M IOPS.

How do you get storage that will work?

Generally speaking, a storage datasheet is not written specifically for you, but it can still be relevant and give you an idea of the general performance of the storage, especially if it includes:

1. Several benchmarks based on different block sizes, different read/write ratios, and both sequential and random workloads.

2. The number of clients used for each benchmark, the RAID type, and the storage features [compression, deduplication, etc.].

3. The IOPS/latency charts for each of the above cases, which is the most important thing.

That is not all. If you are satisfied with those initial metrics, we recommend asking for a PoC to check how the storage works in your environment and for your specific use case.
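
If you want to generate numbers of your own during the PoC, a load generator such as fio makes it easy to mimic the datasheet scenarios with your block size and read/write mix. The commands below are only a rough sketch; the mount point, file size, and job parameters are placeholders you would adjust to your workload:

# 70/30 random read/write mix at 4K blocks, moderate queue depth
fio --name=randrw-4k --filename=/mnt/yourstorage/fio.test --size=10G \
    --rw=randrw --rwmixread=70 --bs=4k --ioengine=libaio --direct=1 \
    --iodepth=32 --numjobs=4 --runtime=300 --time_based --group_reporting

# 64K sequential writes, closer to streaming or backup traffic
fio --name=seqwrite-64k --filename=/mnt/yourstorage/fio.test --size=10G \
    --rw=write --bs=64k --ioengine=libaio --direct=1 \
    --iodepth=8 --numjobs=2 --runtime=300 --time_based --group_reporting

Both runs report IOPS, throughput, and, just as importantly, completion latency percentiles, which gives you exactly the IOPS/latency picture discussed above.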

Buurst will be happy to help you with the sizing, and with the PoC too, with a trial license.

Data Loss Prevention

In this post we will discuss data loss, the worst nightmare in the IT world, how to protect ourselves, and how Buurst can help you keep your data safe.

Why should we care?

I believe the numbers below are enough to make us care:

  • 93% of companies suffering from a catastrophic data loss do not survive – 43% never reopen and 51% close within two years. (University of Texas) 
  • 30% of all businesses that have a major fire go out of business within a year and 70% fail within five years. (Home Office Computing Magazine) 
  • 7 out of 10 small firms that experience a major data loss go out of business within a year. (DTI/Price Waterhouse Coopers) 
  • Every week 140,000 hard drives crash in the United States. (Mozy Online Backup)

Know your enemy!

To know how to protect our data, we of course need to know what to protect it from. There is a wide range of events that can cause data loss: they might be intentional, unintentional, or due to failure, disaster, or crime. We can summarize them in the points below:

  • Formatted disks or deleted data, which can happen due to human error or an application bug that wipes out certain data
  • Data corruption
  • Catastrophic damage
  • Corporate sabotage, or an angry system admin who intentionally deletes all the data and even the backups on all sites (that has happened)
  • A hacker who gains root privileges (this has also happened)
  • A virus, malware, or ransomware

No data or business is 100% safe. That is why you must have a backup strategy that can handle all these failures. But what strategy can handle all of that?

There are several strategies, depending on the budget and the criticality of the data. One of the most common and reasonably successful backup strategies is the 3-2-1 rule, which is accepted and recommended by a wide range of organizations, including US-CERT [United States Computer Emergency Readiness Team].

What is a Backup? 

Before digging deeper into the 3-2-1 rule, let us first define what we mean by backup, to avoid any misconception in the following sections:

According to the Storage Networking Industry Association (SNIA):

“A backup is a collection of data stored on (usually removable) non-volatile storage media for purposes of recovery in case the original copy of data is lost or becomes inaccessible – also called a backup copy.”

Reading between the lines, this means that a backup is an independent copy of the data, i.e. stored on different media. That is a very critical concept, and we will soon see why.

The 3-2-1 backup rule 

1. Have at least 3 copies of your data
Three copies means your original, in-use data plus two additional backups. Usually you keep one copy close at hand so that, in case of any localized failure, you can restore it immediately.

2. Keep these backups on 2 different media
These backups should be stored on two different media types or technologies, since copies on the same media type may have the same life span, and that is risky: you may lose both backups at the same time. The cloud can take care of that, as your data is distributed across several media by default.

3. Store 1 backup offsite
This copy should be far enough away to be safe and to survive catastrophes like fires, earthquakes, or wars that could remove a certain area from the map. I believe in the future this copy should be sent to another planet or even another solar system!

Backup Myths

Before proceeding to how Buurst can help you protect your data based on the 3-2-1 rule, let us demolish two popular myths about backup:

1. I have RAID, I am Safe! 

That is a big misunderstanding of RAID. As its name suggests, RAID only cares about fault tolerance, which is a very different topic from backup. According to SNIA, fault tolerance is:

The ability of a system to continue to perform its function (possibly at a reduced performance level) when one or more of its components has failed. 

Backup is concerned with how to restore any lost data through a wide range of techniques, but it does not care about downtime as long as the data is safe and restorable. Fault tolerance, on the other hand, cares about business continuity in case of failures.

If you lose one disk, RAID is very important for keeping your business going, since serving your first copy of the data continues. But that copy is not an independent copy of the data, so RAID will never protect you from other failures like data corruption or deletion.

2. OK, I will take a snapshot

Snapshots are a great component of your backup strategy, especially when it comes to replication, but a snapshot is not a backup by itself. It does not create an independent copy; it just refers to data on the same disks. So it can help with restoring deleted data, but in the case of data corruption or disk failures it cannot be used as a recovery medium.

How can Buurst help you achieve the 3-2-1 backup rule?

SnapRep is a technology based on snapshots replicated between two nodes. The snapshot process has zero overhead on performance and storage space, and each snapshot is sent, after compression, to another independent node in another availability zone, which is a different data center.

Both nodes can have independent automatic snapshot schedules that protect against data deletion. A SnapClone of any snapshot will allow you to serve and restore the data at the point in time it was taken.

The second node can also act as a redundant node and serve the data in case of any failure; that will be discussed in a different article.

So now we have two independent copies of the data, how about the third one?

You can use the second node as the backup source, so as not to disturb the first node. You can integrate it with any backup solution you have, or you can use a third Buurst node [in a different region] to create a fully independent disaster recovery site by replicating the data to it using rsync, zfs send/receive, etc. This will allow faster access to your data in case of an unforeseen failure and eliminate the time wasted when restoring from tapes (of course, it is a time-budget trade-off).
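
As a rough illustration of that third copy (the pool, dataset, snapshot, and host names below are hypothetical, and your layout will differ), an initial full zfs send followed by periodic incrementals, or a plain rsync, could look like this:

# one-time full replication of a snapshot to the DR node
zfs snapshot pool0/data@dr-baseline
zfs send pool0/data@dr-baseline | ssh dr-node.example.com zfs receive -F pool0/data

# afterwards, ship only the changes between two snapshots (incremental send)
zfs snapshot pool0/data@dr-daily1
zfs send -i pool0/data@dr-baseline pool0/data@dr-daily1 | ssh dr-node.example.com zfs receive pool0/data

# or a file-level alternative using rsync
rsync -aH --delete /pool0/data/ dr-node.example.com:/pool0/data/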

So, by doing that, we have achieved the 3-2-1 rule by having two more copies of the data, one of them in a different region. But the question is: is the 3-2-1 rule enough?

Is the 3-2-1 rule enough? 

It will be sufficient in a wide range of scenarios, but it will not protect against certain cases: a terminated backup admin who still has access to all three environments can easily remove everything, including the snapshots and the DR site. A hacker with the same access can do the same.

A new, intelligent ransomware or virus that we have never heard of can also affect all the data copies, and who knows, maybe it is smart enough to understand the snapshots and harm them too. That is why additional backup models, such as 3-2-2 and 3-2-3, were introduced to mitigate such problems; they can be a discussion for another day.

Final thoughts 

There are a lot of causes of data loss, and the list will keep growing. Humans are usually the biggest data threat, through their intentional and unintentional activities. The race between attack and defense will keep going, so always review and update the risk management plan that drives your backup strategy, but try to avoid too much paranoia!

Sar, Elasticsearch, and Kibana

Kibana is a great visualization tool, and this article shows how to automate building graphs and dashboards through its API, with sar logs as the data source.

Sar is an old, but good, sysadmin tool that helps answer many performance-related questions…

Did we have a CPU spike yesterday at 2 pm when the customer complained?

Do we have enough RAM?

Do we have enough IOPS with our brand-new SSD disks?

Sar is a nice little tool that helps us collect statistics even without CloudWatch, SNMP, or any other monitoring tool configured.

Well, sar has its issues. By default it collects statistics only once every 10 minutes, and you will be deciphering output like this:

01:00:01        CPU     %user     %nice   %system   %iowait    %steal     %idle
04:30:01        all      0.25      0.00      0.23     99.52      0.00      0.00
04:40:01        all      0.25      0.00      0.21     99.54      0.00      0.00
04:50:01        all      0.26      0.00      0.22     99.52      0.00      0.00
05:00:01        all      0.24      0.02      0.23     99.51      0.00      0.00
05:10:01        all      0.26      0.00      0.23     99.51      0.00      0.00
05:20:01        all      0.24      0.00      0.20     99.56      0.00      0.00
05:30:01        all      0.26      0.00      0.22     99.52      0.00      0.00
05:40:01        all      0.25      0.00      0.22     99.53      0.00      0.00
05:50:01        all      0.57      0.00      1.01     48.45      0.00     49.97
06:00:01        all      0.32      0.00      0.41     10.32      0.00     88.95
06:10:01        all      0.24      0.00      0.19      0.33      0.00     99.25
06:20:01        all      0.23      0.00      0.18      0.35      0.00     99.24
06:30:01        all      0.24      0.00      0.17      0.32      0.00     99.27
06:40:01        all      0.24      0.00      0.19      0.36      0.00     99.21
06:50:01        all      0.46      0.00      1.00     25.55      0.00     72.99
07:00:01        all      1.26      0.00      3.52     90.35      0.00      4.87
07:10:01        all      1.26      0.00      4.01     90.57      0.00      4.16
07:20:01        all      1.07      0.00      3.56     89.42      0.00      5.95

This is actually a good example that shows an event possibly requiring further investigation. The server was clearly stuck on the IO subsystem, as the %iowait column shows it was above 99%. At 05:50 it suddenly became better: iowait dropped to nearly zero and overall CPU usage was less than 0.5%. Surely something was going on!
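
As a side note on the 10-minute default mentioned above: the collection interval is usually driven by a cron entry for the sa1 collector, so it can be tightened when finer-grained data is worth the extra log volume. The path below is typical for Red Hat-style systems and differs on other distributions:

# /etc/cron.d/sysstat -- default: one sample every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1

# collect every minute instead, for finer-grained history
*/1 * * * * root /usr/lib64/sa/sa1 1 1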

Elasticsearch is a much more sophisticated technology. Elasticsearch is a distributed search and analytics engine, but when we speak of Elasticsearch, we are really speaking of a bunch of interconnected products commonly known as the Elastic Stack:

Beats – many small agents that ship data to Elasticsearch.

Logstash – accepts data from the Beats and, after potentially complicated processing, uploads the transformed data into Elasticsearch.

Elasticsearch – the search and analytics engine and the heart of the Elastic Stack.

Kibana – a great visualization tool and a graphical interface into Elasticsearch.

Elastic (ELK) Stack Architecture

So, these capital letters comprise what used to be called the ELK stack – E from Elasticsearch, L from Logstash, and K from Kibana. These days we tend to include the Beats in the stack and call it the Elastic Stack.

Performing virtual appliance health checks, our team often needs to analyze log sets from different customers on a regular basis. The logs contain tons of valuable information, so why not feed them to Elasticsearch and see what happens!

Naturally, the log files that we check most often have been sent to Elasticsearch using one of the Beats – like Filebeat – so we could visually explore the logs in Kibana pretty much instantaneously. Keeping logs centrally is good practice, and the ways to do it are countless: rsyslog, Splunk, Loggly, and CloudWatch Logs are popular central log solutions, and Elasticsearch fits really well in this family.

Sar logs are a usual part of the log sets to be analyzed, but there is sometimes a tiny inconvenience with them. They are often generated by older sar versions, and there are two problems with that:

1. The current sar does not understand logs from the old versions, so the old sar version needs to be installed just to process them.

2. The graphs can’t be easily produced due to the limitations of the old versions.

The backward compatibility of sar logs is out of our hands, and with some practice and automation the old sar version installation is not too much of a problem. At the same time, analyzing sar logs for many days and checking many parameters demands some graphical data presentation. For example, a current sar on Ubuntu allows these commands to run:

sadf -g > cpu.svg
sadf -g -- -r > ram.svg

See these graphs in your favourite browser or image viewer:

The older sar versions simply don’t have an option to produce graphics. Still, sar logs are well structured, and Elasticsearch is a powerful tool, so we can process the logs in two easy steps:

1. Load sar data into Elasticsearch.

2. Use Kibana to do all the visualizations and dashboards based on the data in Elasticsearch.

So how do we do it automatically? By all means, there are many logs, and we don’t want to do this manually after the proof of concept!

The answer is the API and bash. We occasionally thought of writing the API calls in Python or another full-featured language, but bash proved to be more than enough for most cases.

We used two completely different APIs to do the task: the Elasticsearch API to load the data, and the Kibana API to create all the graphs and dashboards.

We have found that the Kibana API is less documented, and we feel that more examples would benefit the community, so we provide all the API call examples here. Each API call is a curl command referring to a JSON file; we provide both the curl command and an example JSON file for every call.

We have also utilized the Kibana concept of spaces to distinguish between logs from different servers. One space is for one server only: ten servers means ten Kibana spaces. Using spaces greatly reduces the risk of processing data for the wrong server.

Depending on which metric we process in the loop, we used the following commands on the sar log, referred to as $file below (a consolidated sketch of the loop follows the list):

for CPU:

sadf -d "$file"

for RAM:

sadf -d "$file" -- -r

for swap:

sadf -d "$file" -- -S

for IO:

sadf -d "$file" -- -b

for disks:

sadf -d "$file" -- -d -p

for network:

sadf -d "$file" -- -n DEV
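
Tying those commands together, the outer loop is nothing more than a case statement over the metrics for every sar log in the set. This is a simplified sketch; the file naming, metric list, and output locations are assumptions, and error handling is omitted:

METRICS="cpu ram swap io disk net"

for file in sa??; do                  # one binary sar log per day
  for METRIC in $METRICS; do
    case $METRIC in
      cpu)  sadf -d "$file"           > "$METRIC.$file.csv" ;;
      ram)  sadf -d "$file" -- -r     > "$METRIC.$file.csv" ;;
      swap) sadf -d "$file" -- -S     > "$METRIC.$file.csv" ;;
      io)   sadf -d "$file" -- -b     > "$METRIC.$file.csv" ;;
      disk) sadf -d "$file" -- -d -p  > "$METRIC.$file.csv" ;;
      net)  sadf -d "$file" -- -n DEV > "$METRIC.$file.csv" ;;
    esac
    # each CSV is then converted to ndjson and bulk-loaded as shown below
  done
done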

Once we have the output from one of the above commands, or from whatever other command we want to process further and visualize, it’s time to create the indexes in Elasticsearch. Indexes are required so there is a place to which we can upload the sar data. For example, the index for CPU data is created this way:


curl -XPUT -H'Content-Type:application/json' $ELASTIC_HOST:9200/sar.$METRIC.$HOSTNAME?pretty -d @create_index_$METRIC.json
 
$ cat create_index_cpu.json

{
  "mappings": {
    "properties": {
      "hostname":    { "type": "keyword" }, 
      "interval":  { "type": "integer"  },
      "timestamp":   {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss zzz"
      },
      "CPU":    { "type": "integer" }, 
      "%user":  { "type": "float"  },
      "%nice":   { "type": "float"  },
      "%system":    { "type": "float" }, 
      "%iowait":  { "type": "float"  },
      "%steal":   { "type": "float"  },
      "%idle":   { "type": "float"  }
    }
  }
}

Once the indexes for all the metrics are created, it’s time to upload the sar data into the Elasticsearch indexes. A bulk upload is the easiest way, and below is an example JSON file for swap sar data:


curl -H 'Content-Type: application/x-ndjson' -XPOST $ELASTIC_HOST:9200/_bulk?pretty --data-binary @interim.json

$ more interim.json

{"index": {"_index": "sar.swap.server1.example.com "}}
{"hostname":"# hostname","interval":"interval","timestamp":"timestamp","kbswpfree":"kbswpfree"
,"kbswpused":"kbswpused","%swpused":"%swpused","kbswpcad":"kbswpcad","%swpcad":"%swpcad"}
{"index": {"_index": "sar.server1.example.com "}}
{"hostname":"SoftNAS-A83PR","interval":"595","timestamp":"2020-06-01 05:10:01 UTC","kbswpfree"
:"0","kbswpused":"4128764","%swpused":"100.00","kbswpcad":"23324","%swpcad":"0.56"}
{"index": {"_index": "server1.example.com"}}
{"hostname":"SoftNAS-A83PR","interval":"595","timestamp":"2020-06-01 05:20:01 UTC","kbswpfree"
:"0","kbswpused":"4128764","%swpused":"100.00","kbswpcad":"23324","%swpcad":"0.56"}
{"index": {"_index": "server1.example.com"}}
{"hostname":"SoftNAS-A83PR","interval":"595","timestamp":"2020-06-01 05:30:01 UTC","kbswpfree"
:"0","kbswpused":"4128764","%swpused":"100.00","kbswpcad":"23324","%swpcad":"0.56"}

All the Elasticsearch work is done now. The data is uploaded to the Elasticsearch indexes, and we switch to Kibana to create a few nice graphs.

First, we change the Kibana time format and time zone settings to how we like them.

These settings can be found under Advanced Settings in the Kibana UI, but they are easy to forget for new Kibana installations, so we script them too:


curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @change_time_format.json  http://$KIBANA_HOST:5601/s/$SPACE_ID/api/kibana/settings

curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @change_time_zone.json  http://$KIBANA_HOST:5601/s/$SPACE_ID/api/kibana/settings

$ cat change_time_format.json 
{"changes":{"dateFormat:scaled":"[\n  [\"\", \"HH:mm:ss.SSS\"],\n  [\"PT1S\", \"HH:mm:ss\"],\n  [\"PT1M\", \"MM-DD HH:mm\"],\n  [\"PT1H\", \"YYYY-MM-DD HH:mm\"],\n  [\"P1DT\", \"YYYY-MM-DD\"],\n  [\"P1YT\", \"YYYY\"]\n]"}}

$ cat change_time_zone.json 
{
  "changes":{
    "dateFormat:tz":"Etc/GMT+5"
  }
}

Let’s create a Kibana space for each server.

The screenshot shows the space selector page, where we can keep using the default space or choose one of the server spaces created with the API call below.

curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @interim.json  http://$KIBANA_HOST:5601/api/spaces/space


$ cat interim.json 
{
  "id": "server1.example.com",
  "name": "server1.example.com"
}
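
Since we create one space per server, the same call is easily wrapped in a loop; the server list below is, of course, a placeholder:

for SERVER in server1.example.com server2.example.com server3.example.com; do
  cat > interim.json <<EOF
{
  "id": "$SERVER",
  "name": "$SERVER"
}
EOF
  curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" \
       -d @interim.json "http://$KIBANA_HOST:5601/api/spaces/space"
done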

Now, the real Kibana work: creating index patterns. The example shows the JSON file for the swap data:

curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @interim.json  http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/index-pattern

$ cat interim.json 
{
  "attributes":
    {
      "title": "sar.swap.server1.example.com *",
      "fields": "[{\"name\":\"kbswpfree\",\"type\":\"number\",\"esTypes\":[\"float\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"kbswpused\",\"type\":\"number\",\"esTypes\":[\"float\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"%swpused\",\"type\":\"number\",\"esTypes\":[\"float\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"kbswpcad\",\"type\":\"number\",\"esTypes\":[\"float\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"%swpcad\",\"type\":\"number\",\"esTypes\":[\"float\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"swap\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"_id\",\"type\":\"string\",\"esTypes\":[\"_id\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_index\",\"type\":\"string\",\"esTypes\":[\"_index\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_score\",\"type\":\"number\",\"count\":0,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_source\",\"type\":\"_source\",\"esTypes\":[\"_source\"],\"count\":0,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_type\",\"type\":\"string\",\"esTypes\":[\"_type\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"hostname\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"interval\",\"type\":\"number\",\"esTypes\":[\"integer\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true},{\"name\":\"timestamp\",\"type\":\"date\",\"esTypes\":[\"date\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true}]"
    }
}

Next we create graphs, which are called visualizations in Kibana. The JSON file below is for one of the CPU graphs:


curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @$METRIC.$HOSTNAME.$i.json http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/visualization


$ cat cpu.server1.example.com.%user.json
{
  "attributes":
    {
      "title": "sar-cpu-server1.example.com-%user",
      "visState": "{\"title\":\"%user\",\"type\":\"line\",\"params\":{\"type\":\"line\",\"grid\":{\"categoryLines\":false},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"filter\":true,\"truncate\":100},\"title\":{}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":0,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Max %user\"}}],\"seriesParams\":[{\"show\":true,\"type\":\"line\",\"mode\":\"normal\",\"data\":{\"label\":\"%user\",\"id\":\"1\"},\"valueAxis\":\"ValueAxis-1\",\"drawLinesBetweenPoints\":true,\"lineWidth\":2,\"interpolate\":\"linear\",\"showCircles\":true}],\"addTooltip\":true,\"addLegend\":false,\"legendPosition\":\"right\",\"times\":[],\"addTimeMarker\":false,\"labels\":{},\"thresholdLine\":{\"show\":false,\"value\":10,\"width\":1,\"style\":\"full\",\"color\":\"#34130C\"},\"dimensions\":{\"x\":null,\"y\":[{\"accessor\":0,\"format\":{\"id\":\"number\"},\"params\":{},\"aggType\":\"count\"}]}},\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"max\",\"schema\":\"metric\",\"params\":{\"field\":\"%user\"}},{\"id\":\"2\",\"enabled\":true,\"type\":\"date_histogram\",\"schema\":\"segment\",\"params\":{\"field\":\"timestamp\",\"useNormalizedEsInterval\":true,\"scaleMetricValues\":false,\"interval\":\"10m\",\"drop_partials\":false,\"min_doc_count\":1,\"extended_bounds\":{}}}]}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[],\"indexRefName\":\"kibanaSavedObjectMeta.searchSourceJSON.index\"}"
      }
    },
  "references": [
      {
        "name": "kibanaSavedObjectMeta.searchSourceJSON.index",
        "type": "index-pattern",
        "id": "2a5ed4b0-b451-11ea-a8db-210d095de476"
      }
    ]

}

We are pretty much done, but we could have generated dozens of graphs by now, so let’s make a few dashboards to organize the graphs by metric, meaning one dashboard for CPU, one for RAM, one for each disk, etc.:


curl -X POST -H "Content-Type: application/json" -H "kbn-xsrf: true" -d @$INTERIM_FILE http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/dashboard

{
  "attributes":
    {
      "title": "sar-swap-server1.example.com",
      "hits": 0,
      "description": "",
      "panelsJSON": "[{\"version\":\"7.5.1\",\"gridData\":{\"w\":12,\"h\":8,\"x\":0,\"y\":0,\"i\":\"sar-swap-softnas-a83pr-kbswpfree\"},\"panelIndex\":\"sar-swap-softnas-a83pr-kbswpfree\",\"embeddableConfig\":{},\"panelRefName\":\"panel_0\"},{\"version\":\"7.5.1\",\"gridData\":{\"w\":12,\"h\":8,\"x\":12,\"y\":0,\"i\":\"sar-swap-softnas-a83pr-kbswpused\"},\"panelIndex\":\"sar-swap-softnas-a83pr-kbswpused\",\"embeddableConfig\":{},\"panelRefName\":\"panel_1\"},{\"version\":\"7.5.1\",\"gridData\":{\"w\":12,\"h\":8,\"x\":24,\"y\":0,\"i\":\"sar-swap-softnas-a83pr-%swpused\"},\"panelIndex\":\"sar-swap-softnas-a83pr-%swpused\",\"embeddableConfig\":{},\"panelRefName\":\"panel_2\"},{\"version\":\"7.5.1\",\"gridData\":{\"w\":12,\"h\":8,\"x\":36,\"y\":0,\"i\":\"sar-swap-softnas-a83pr-kbswpcad\"},\"panelIndex\":\"sar-swap-softnas-a83pr-kbswpcad\",\"embeddableConfig\":{},\"panelRefName\":\"panel_3\"},{\"version\":\"7.5.1\",\"gridData\":{\"w\":12,\"h\":8,\"x\":48,\"y\":0,\"i\":\"sar-swap-softnas-a83pr-%swpcad\"},\"panelIndex\":\"sar-swap-softnas-a83pr-%swpcad\",\"embeddableConfig\":{},\"panelRefName\":\"panel_4\"}]",
      "optionsJSON": "{\"useMargins\":true,\"hidePanelTitles\":false}",
      "version": 1,
      "timeRestore": false,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{\"query\":{\"query\":\"\",\"language\":\"kuery\"},\"filter\":[]}"
      }


    },
    "references": [

      {
        "name": "panel_0",
        "type": "visualization",
        "id": "56224aa0-b451-11ea-a8db-210d095de476"
      },
      {
        "name": "panel_1",
        "type": "visualization",
        "id": "56b95a80-b451-11ea-a8db-210d095de476"
      },
      {
        "name": "panel_2",
        "type": "visualization",
        "id": "5752db60-b451-11ea-a8db-210d095de476"
      },
      {
        "name": "panel_3",
        "type": "visualization",
        "id": "57ec5c40-b451-11ea-a8db-210d095de476"
      },
      {
        "name": "panel_4",
        "type": "visualization",
        "id": "58865250-b451-11ea-a8db-210d095de476"
      }
    ]

}

JSON files often look scary, but they are not, actually. Once the desired object is created manually in the Kibana UI, its JSON can be retrieved, and copy-and-paste is easily applied with only minor editing or automatic replacement.

Just a few more API calls are required while coding all the visualizations and dashboards:

Get index pattern id:

curl -X GET -H "Content-Type: application/json" -H "kbn-xsrf: true" "http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/_find?type=index-pattern&fields=title"

Get visualization id:

curl -X GET -H "Content-Type: application/json" -H "kbn-xsrf: true" "http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/_find?type=visualization&per_page=1000"
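
Since the id returned by these calls is what goes into the "references" section of the visualization and dashboard JSON, we usually extract it in the script with jq (the title filter below assumes the swap index pattern created earlier):

INDEX_PATTERN_ID=$(curl -s -H "kbn-xsrf: true" \
  "http://$KIBANA_HOST:5601/s/$SPACE_ID/api/saved_objects/_find?type=index-pattern&fields=title&per_page=1000" \
  | jq -r '.saved_objects[] | select(.attributes.title | startswith("sar.swap")) | .id')

echo "Index pattern id: $INDEX_PATTERN_ID"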

Let’s enjoy the newly created dashboards!

The CPU dashboard shows a spike related to a massive data copy operation:

The RAM dashboard shows the same data copy operation from a memory consumption point of view:

The root disk dashboard:

The data disk dashboard. The server has 4 data disks in RAID 0, and the dashboard shows metrics for one of the data disks:

Webinar Recap – The Three Strategies to Increase Performance for Your Applications in AWS

Above is a recording, and what follows is a full transcript from the webinar, “3 Strategies to Increase Performance for Your Applications in AWS.” You can download the full slide deck on SlideShare.

Hi, this is Jeff Johnson. I’m head of product marketing at Buurst. This is a recap of a webinar that we did — it’s called The Three Strategies to Increase Performance for Your Applications in AWS. 

Many customers built these applications in their data centers, where they were designed for extremely fast storage. As they migrated their line-of-business applications to the cloud, a lot of those applications worked and didn’t have a constraint on access to the storage.

But a few of those applications would not run, or will not run, in the cloud with acceptable performance because of the demand, the type of application, and the connections, throughput, or latency required to get the data off of the storage and back into the application, or to put it back onto the storage.

They weren’t getting the performance that they needed, so they tried using AWS EFS. They tried using FSx. They tried using maybe some open-source NAS and they weren’t getting the performance until they discovered SoftNAS. 

SoftNAS then had the appropriate levers to pull, buttons to push to make the application run with great performance, and we’re going to do a deep dive into why and we have some statistics on some performance benchmarking we did that prove out some of these things. 

Companies trust Buurst for data performance, data migration, data cost control, high availability, control, and security. When we think about this scenario of performance, there are two different camps, let’s say. There are managed storage services, which are great because someone else is managing the storage layer for you. Then there’s a cloud NAS, network attached storage, like SoftNAS, which sits between your cloud clients and your on-premises clients and your storage.

Managed Storage Service

For managed storage services like EFS or FSx, those solutions… First of all, they serve many different companies or customers, and they throw in security so company A can’t get to company B. The other end of that is they limit the amount of throughput that each company has to access storage, so that no one becomes a noisy neighbor.

Then we have SSD, HDD, and cold storage on the backend there that we have access to, so we can always just go to a faster disk or things like that to get better performance, but that comes with a price.

To really think about increasing the performance of a managed storage service, we have two main levers we can pull: we can purchase more storage, or we can purchase more throughput.

Now, this chart is based on the AWS EFS and FSx websites. What it’s telling me is that the more gigs I have purchased, the more throughput I have.

Increase Performance by Increasing Storage

For instance, on the very top row for EFS, if I purchase 10 gigs, I get 0.5 megabytes per second and it’s going to cost me $3 a month. If my solution requires 100 megs of throughput, I need to purchase two terabytes of storage. That will cost me $614 a month, and that’s fine.

To get more performance, I can increase the amount of storage. I can also increase my throughput by purchasing provisioned throughput.

In this case, I needed that 100 megs of throughput, and I can purchase that. I’m basically buying two terabytes of storage. Even if I only have one terabyte of storage but I need 100 megs of throughput, I’m effectively still paying for two terabytes of storage, but I’m provisioned with 100 megs.

The same thing with 350 megs: I’m effectively at eight terabytes. With 600 megs, I’m basically purchasing 16 terabytes of storage for the throughput performance that I require. Now that’s great.

Like I said earlier, for a lot of applications that runs very well and it’s fine. But the applications that it doesn’t quite work for turn to SoftNAS. They turn to a cloud NAS.

Cloud NAS

With SoftNAS, I can use an efficient protocol. If it’s a SQL Server solution, I can use iSCSI. If it’s Linux, I can use NFS; Windows, I can use CIFS. And it can all connect to the same pool of storage that I control and manage off my NAS.

I can use different disk speeds or more disks. I can use more, smaller disks rather than fewer, larger disks in my RAID array. I can do things like that. For this conversation, we’re going to talk about the cool ways to increase performance and manage costs.

I can increase the compute instance, and I can increase the read/write cache of my SoftNAS. Let’s dive into that. In AWS, I have compute instances for my NAS. In this case, I have an m3.xlarge with 4 vCPUs and 15 gigs of RAM, and I get 100 megabytes of throughput.

That’s the throughput from my NAS to my clients, because all the storage is directly connected to my NAS. Or I have a c5.9xlarge with 36 vCPUs and 72 gigs of RAM, and I get 1,200 megs of throughput. Remember, that’s from my NAS to my clients.

For my NAS to my storage, that storage is directly connected to my NAS, and in AWS I can connect about a petabyte of storage. I can also utilize caching. We know how important caching is. On my NAS, I have an L1 cache, which is RAM, and an L2 cache, which is a dedicated disk for a specific storage pool.

With SoftNAS, by default, I use half of the available RAM for L1 cache. If I have 32 gigs of RAM, 16 gigs of that are available for L1 cache by default.

For L2 cache, I can use NVMe or SSD. For instance, let’s say I have a SQL Server solution that I’m providing data to, and all of its storage is in one array. That’s great. I can have NVMe dedicated to the pool servicing that SQL Server, and my solution is just humming.

Then I could have another pool, let’s say all my web servers’ data, in another array. I can have all of that with a bigger SSD drive for the cache. I can tune my solution in my NAS controller.

We ran a performance benchmark on AWS EFS against SoftNAS on AWS. Because so many customers were coming to us and saying, “We came to you because you gave us the performance that we needed,” we really wanted to dive down and try to figure out why or how.

To summarize: throughput is a measurement of how fast your storage can read and write data. For IOPS, the higher the IOPS, the faster you have access to the data on that disk. Latency is a measurement of the time it takes for a component or subsystem to process a data request or transaction.

How did we do it? Well, we had a Linux fio server with four Linux fio clients connecting through NFS either to SoftNAS with no L2 cache or to AWS EFS, with SSD storage on both. We had a basic, a medium, and a high configuration.

Basic was 100 megs per second. Medium was 350 megs per second. High was 600 megs per second. We gave them different amounts of storage to provide more throughput to the storage on AWS EFS.

For storage throughput, what we learned was that the more RAM, CPU, and network we gave to the NAS, the better the numbers we got. And we were able to provide continuously sustained throughput and predictable performance because we weren’t throttling any other customers.

Throughput (MiB/s) – Higher is Better

What we found was pretty remarkable. At our basic tier, we had almost twice the performance. When we got to the higher end, we had 23 times the performance for read/write sequential and read/write random with a 70/30 combination.

On IOPS, again, what we learned was that with more CPU, RAM, and network speed, we got better IOPS. We know we could have used a faster disk such as NVMe, and that’s how we got a million IOPS on AWS. We could have added more disks to an array to aggregate and increase the disk IOPS. We could have done that too, but we didn’t.

IOPS – Higher is Better

What we found is, again, we were almost two times the performance on IOPS at our basic tier. I think we were around 18 times on the medium and 23 times the performance at the higher end — just huge numbers that we are really proud of.

Then there is latency: how long it takes to get that data on and off the disk. What we found is no surprise: if I increase the CPU, RAM, and network speed, I decrease latency. I could have decreased latency by using NVMe, but that adds a substantial cost.

Latency – Lower is Better

I could have decreased latency by using more, smaller disks rather than larger disks. Again, the results were about the same: we were two times lower at the basic tier, about 18 times lower in the midrange, and about 23 times lower at the high end.

Now, all of that data has been published. It’s on our blog at buurst.com, and it’s an I-chart. If you want to dive into this data, we’d be more than happy to talk with you about how we achieved these numbers, help you try to replicate them, or give you a demo of how we think we can increase the performance of your specific application.

Let’s talk about specific applications where throughput, IOPS, or latency matters. For throughput, if you have many client connections, like a web server array or even just cloud virtual desktops or actual desktops on-premises, you have lots of clients accessing the data. You need more throughput.

Maybe they are accessing video files, office files, AutoCAD files, or web server content — more throughput. For example, one of our great partners, Petronas, had many clients out there that needed access to the content. SoftNAS was able to handle the number of clients accessing the data.

When we’re talking about IOPS, we’re talking about small-block-size transactions, like a database server or an email server that needs to access and write data in very small chunks. We have a great partner, Halliburton, for instance. Their application for the oil and gas industry — their Landmark application — takes seismic data, just massive amounts of data.

It has to pull that data off the disk and then render it to show it visually to the plate-tectonics engineers. They were able to take their application, import it, and move it to the cloud in record speed with SoftNAS, and then have it run at a great performance level. Absolutely fantastic.

On latency, think about applications like banking, stock exchanges, and finance that need access in and out of that data as fast as they can, or streaming. Netflix is a great partner of ours. We provide the solution for great cloud partners like AWS, accessing the cloud storage and providing the NAS to the application.

How do I increase performance for AWS applications? We’ve got to understand what the bottlenecks are. How does the application perform? Does it need throughput? Does it need IOPS? Does it need low latency?

How to Increase Performance for Your AWS Application

  1. Understand your solution and where the bottlenecks are
    • Throughput
    • IOPS
    • Latency
  2. Then understand if managed storage will work for you
  3. Reach out to SoftNAS to better understand your performance options and develop a solution that meets the specifications of your workload

When we understand that, then we can understand whether managed storage will work for you. That’s great. If it won’t, come talk to us. Meet a cloud storage performance professional and just talk about some ideas, what you’re seeing out there, and why it won’t work. We’ve helped so many customers.

We have helped tons of customers get their applications running on AWS. We were the first NAS up there; we started the whole industry of NAS in the cloud. We have lots of information in performance blogs. We have an e-book.

We have a dedicated performance web page at buurst.com/performance. We’d love to show you our product. Get a demo, talk to a performance professional. At Buurst, we are a data performance company. That’s all we want to do: live and breathe and think about and provide you the fastest access to your cloud storage for the lowest price.

I’m Jeff Johnson. I am at Buurst. I am a Buurster. Thank you for listening. Please, let us help you get those applications running. Thank you.   

Buurst Now Available in the Microsoft Azure Marketplace

Microsoft Azure customers worldwide now gain access to Buurst to take advantage of the scalability, reliability, and agility of Azure to drive application development and shape business strategies.

September 01, 2020 08:00 AM Eastern Daylight Time

BELLEVUE, Wash. – Buurst, a leading enterprise-class data performance company, today announced the availability of its flagship product, SoftNAS, in the Microsoft Azure Marketplace, an online store providing applications and services for use on Azure. Buurst customers can now take advantage of the productive and trusted Azure cloud platform, with streamlined deployment and management.

“Through our solutions, we strive to provide our customers with better application performance, lower cloud storage costs, and the control they need,” said Garry Olah, CEO of Buurst. “The availability of our SoftNAS product in the Microsoft Azure Marketplace enables us to offer these key benefits to a wider range of organizations.”

Buurst is dedicated to delivering new levels of data performance, control, and availability to position businesses to move, access, and leverage data quickly. The company and its innovative solutions offer impressive levels of performance in the cloud, having reached 1 million input/output operations per second (IOPS), and provide patented cross-zone high availability with a 99.999 percent uptime guarantee, giving customers true control over their data in the cloud.

Buurst’s flagship product, SoftNAS, offers customers control by providing the resources required to develop a new environment and enabling businesses to apply the configuration variables they need to get the maximum performance for petabytes of data on Azure. Additionally, businesses can significantly reduce the cost of cloud storage through SoftNAS’ optimization of Azure’s premium and standard managed disk storage, as well as leveraging its deduplication, compression and tiering capabilities. SoftNAS optimizes data performance while keeping costs in check for businesses.

Sajan Parihar, senior director, Microsoft Azure Platform at Microsoft Corp., said, “We’re pleased to welcome Buurst to the Microsoft Azure Marketplace, which gives our partners great exposure to cloud customers around the globe. Azure Marketplace offers world-class quality experiences from global trusted partners with solutions tested to work seamlessly with Azure.”

The Azure Marketplace is an online market for buying and selling cloud solutions certified to run on Azure. The Azure Marketplace helps connect companies seeking innovative, cloud-based solutions with partners who have developed solutions that are ready to use.

About Buurst

Buurst, Inc. is a leading enterprise-class data performance company that delivers the migration, cost management, and control of data in the cloud that customers need. Buurst optimizes cloud storage decisions for organizations, from migration to granular monitoring and management to storage tiering for cost performance, across all major cloud platforms, ensuring superior performance and optimization of business-critical data. Buurst has offices in the Seattle and Houston areas and employees located across the globe. Buurst powers some of the largest enterprises, including Samsung, Halliburton, T-Mobile, Boeing, Netflix, L’Oréal and WWE. For more information, visit www.buurst.com.

Click here for link to original Business Wire Press Release