How UltraFast Uses Reinforcement Learning to Tackle Tough Network Conditions

Latency and packet loss over wide area networks, the Internet, and RF-based network links (e.g., satellite, cellular, packet radio) have long been barriers to large-scale data transfers. TCP/IP’s windowing algorithm is infamous for reacting poorly to packet loss: it reduces the amount of data TCP is willing to send per transaction, making transfers reliable but extremely slow. Today, large amounts of data increasingly need to be transferred across available networks, from where they are created to where they are consumed and used. Sometimes this data is purely for disaster recovery and backup; other times it's for important analytics and other business processes. Edge computing promises to address some of these issues by moving workloads closer to the point of data creation, but even then, data must often be transferred to centralized locations (data centers, public clouds, SaaS services) to make use of the insights gained across many edge sites.

Buurst Fuusion’s UltraFast® Machine Learning Approach

Over the years, many different types of algorithms have been devised to try and address this network throughput optimization problem. Buurst’s Fuusion product includes a feature called UltraFast®, which overcomes the challenges posed by TCP over highly latent or lossy networks in a unique way. As we will see in this post, UltraFast utilizes a type of AI/ML technology to learn, adapt and optimize data transfers over troublesome network conditions.

The UltraFast Gambler Agent

UltraFast uses a machine learning process built around a set of “gamblers”: data transmission experiments that each place a different “bet” on what the ideal transmission rate will be. No model of the network is available ahead of time; the agent must run its own experiments to learn the network's behavior.

The main goals are to:

  1. Maximize network throughput by sending as much data as possible
  2. Avoid creating packet loss due to sending data too quickly
  3. Detect when external factors (other network participants, changing IP routes, and other dynamic conditions) are causing congestion or interfering with packet throughput, and use this information to place improved bets.

The Agent creates a set of “Gambler” processes, each running in an independent thread. Each Gambler is given a “Data Transmission Bet” to place; the bet is its data transmission rate, i.e., the time delay between sending each packet. The data is sent to a connection at the distant end of the network, and one of three responses may occur: 1. an ACK, indicating good data receipt; 2. a NAK, indicating bad data receipt; or 3. no response at all, indicating a lost packet (timeout). Each Gambler sends several data packets and records the overall success rate: how many packets were sent, how many succeeded, and how many failed. When the Gamblers finish, each is assigned an overall score. The more acknowledged, successful data packets sent, the higher the score; the more NAKs or timeouts (packet losses), the lower the score. As we will soon see in more detail, the Agent then uses these scores to reward successful Gamblers, which are allowed to “breed” and multiply during the next generation, or experiment cycle. Less successful or failed Gamblers are pruned and eliminated. The process resembles natural selection: the strong and successful survive, while the weak and unsuccessful do not propagate.
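The betting-and-scoring idea can be sketched in a few lines of Python. This is an illustrative toy, not Buurst's implementation: the class name, the penalty weight, and the outcome labels are assumptions made for the example.

```python
class Gambler:
    """One data transmission experiment with a fixed "bet"."""

    def __init__(self, inter_packet_delay_ms):
        # The "bet": the time delay between sending consecutive packets.
        self.delay_ms = inter_packet_delay_ms
        self.acks = 0
        self.naks = 0
        self.timeouts = 0

    def record(self, outcome):
        # Three possible responses from the distant end of the network.
        if outcome == "ACK":        # good data receipt
            self.acks += 1
        elif outcome == "NAK":      # bad data receipt
            self.naks += 1
        else:                       # no response at all: lost packet
            self.timeouts += 1

    def score(self):
        # Reward acknowledged packets; penalize NAKs and losses.
        # The 2x penalty weight is arbitrary for this sketch.
        return self.acks - 2 * (self.naks + self.timeouts)


g = Gambler(inter_packet_delay_ms=0.5)
for outcome in ["ACK", "ACK", "ACK", "NAK", "ACK"]:
    g.record(outcome)
print(g.score())  # 4 ACKs minus one penalized NAK -> prints 2
```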

UltraFast Reinforcement Learning Process

The chart below depicts the UltraFast learning cycle and each step in the process.

The UltraFast learning loop runs repeatedly, processing these steps:

  1. A Monte Carlo-derived genetic algorithm generates random strategies for the initial set of gamblers, then breeds new gamblers based on the previous cycle's winning results.
  2. A new generation of dozens of gamblers is created at the start of each cycle, each with its own rate of sending data packets.
  3. Gamblers send their data packets, measuring ACKs, NAKs, and lost packets.
  4. Each gambler's win/loss rate is scored: more packets sent equals a higher score, while lost packets or data transmission errors (NAKs) penalize the score.
  5. Each gambler's loss rate is compared with the current baseline loss rate ("loss-zero"), which is established separately with regularly timed packets.
  6. Winning gamblers showing the best results are rewarded by being bred, producing similar successful gamblers for the next cycle. The agent prunes the losers and feeds the learned results forward into the genetic algorithm. In addition to the newly created "successful" gamblers, new random variants are added to explore beyond the newly defined boundaries, enabling the system to adapt to changing network conditions.

The above 6-step process runs continually, optimizing data throughput while minimizing packet loss and congestion, and adapting to a constantly evolving, complex network environment. Reinforcement learning enables UltraFast to adapt to each unique network topology and navigate its evolving traffic and routing conditions.
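The six-step cycle can be mocked up as a toy genetic loop. Everything here is illustrative: the simulated link, the scoring exponent, and the population sizes are assumptions, and the real agent measures a live network rather than a stand-in function.

```python
import random

random.seed(0)  # reproducible toy run

def simulate_success_rate(rate_mbps, capacity_mbps=1000):
    # Stand-in for a real network: below capacity packets succeed;
    # above it, the success fraction shrinks with the overshoot.
    return 1.0 if rate_mbps <= capacity_mbps else capacity_mbps / rate_mbps

def score(rate_mbps):
    # Reward throughput, heavily penalize loss (exponent is arbitrary).
    return rate_mbps * simulate_success_rate(rate_mbps) ** 4

def run_generation(population, n_keep=4, n_random=4):
    # Step 4-6: score, keep winners, breed mutated children, add explorers.
    winners = sorted(population, key=score, reverse=True)[:n_keep]
    children = [max(1.0, w * random.uniform(0.9, 1.1)) for w in winners]
    explorers = [random.uniform(1, 2000) for _ in range(n_random)]
    return winners + children + explorers

# Step 1-2: an initial generation of random bets (rates in Mbps).
population = [random.uniform(1, 2000) for _ in range(12)]
for _ in range(30):
    population = run_generation(population)

best = max(population, key=score)
print(round(best))  # the best surviving bet converges toward link capacity
```

With each cycle the winners cluster closer to the simulated 1 Gbps capacity, while the random explorers would let the loop re-adapt if the capacity changed mid-run.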

UltraFast Speed Test

UltraFast includes a speed test feature, which sends “iperf” data through the UltraFast optimizer, first as a download test and then as an upload test. This is analogous to a typical Internet or broadband speed test, except it uses UltraFast technology and compares the throughput results against plain TCP/IP. In the following screenshot, the TCP results are displayed in red (mostly hidden behind the blue UltraFast chart). The link being tested is on AWS between the Ohio region in the USA and the Cape Town region in South Africa. The latency averages around 250 milliseconds round-trip time, with little to no packet loss.

TCP/IP averages just 144 Mbps over this moderate-latency 1 Gbps link. During the initial Download Speed test, the blue (cyan) UltraFast chart slowly increases its throughput over time as the gamblers run and the reinforcement learning algorithm learns the particular characteristics of this network. Once UltraFast learns the network, it is eventually able to peg the link near 1 Gbps at times. Then the upload test starts. Since UltraFast has already learned this network, it immediately optimizes the throughput, averaging 822 Mbps vs. TCP’s 144 Mbps.

As network conditions vary over time, UltraFast’s intelligent learning algorithm continues to observe, adapt, and learn in order to continually optimize network throughput. This is very important for long-running bulk data transfer jobs of terabytes or more. Because these long-running jobs occupy large amounts of network bandwidth over time, they are much more likely to experience competing traffic at different times of day; e.g., backup jobs running overnight, user downloads during daytime hours, and many other variables, including network routes changing the underlying network characteristics over time.

Summary

Optimizing data throughput over challenging network conditions is an age-old problem, and it now has a new type of solution: one that uses reinforcement learning to intelligently optimize while constantly learning and adapting to changing network conditions. To learn more about the UltraFast feature of Fuusion and how it addresses challenging, high-latency and lossy network conditions, please visit the Fuusion page. For more detailed insights into UltraFast, its machine learning technology, and its overall architecture, you can download the UltraFast technical white paper.

Verifying Snapshots on SoftNAS for Compliance for Halliburton

As we grow and evolve our Fuusion product, we are constantly finding new use cases and learning to better respond to customer requirements. Halliburton is one of Buurst’s earliest customers, having used our SoftNAS product long before the advent of Fuusion. So, when Halliburton came to us looking for a solution related to SoftNAS, we rose to the challenge. And because our solution to Halliburton’s problem involved both of our products, it seemed the perfect opportunity to illustrate, at a high level, how Fuusion was used to meet their needs.

Halliburton operates over 200 SoftNAS servers across their infrastructure, leveraging both AWS and Azure cloud storage. For compliance purposes, Halliburton needed to ensure and document that for each SoftNAS instance or VM, snapshots were being performed as scheduled. This needed to happen in an automated fashion, with minimal manual effort, across all 200 virtual machines. The solution was of double interest to us because it allowed us not only to prove the flexibility of our Fuusion product, but also to verify that SoftNAS’ built-in snapshot solution operated as intended across a large deployment.

In order to show how our solution works, we created a POC on a smaller scale.

We hope this smaller scale deployment proves an ideal introduction to our Fuusion product in operation.

Identifying Hosts

The first step to verifying that each of the 200 SoftNAS deployments operated as intended is, of course, to identify the hosts. This proved an easy task: all it required was compiling a CSV list of all IP addresses or host names. To verify our flow, we started with a small sample of five IP addresses: two working addresses based on a sample environment, and three non-existent addresses to simulate failures. In Halliburton’s production environment, this list would contain the IPs or hostnames of each SoftNAS deployment.

This CSV is fed into the first processor of the Fuusion flow. The processor, named Get_host_list, obtains files from each live host via the IP addresses or hostnames provided and a Python script running in the background. The script grabs snapshot details from each live instance.
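A hypothetical sketch of what such a collection step might look like in Python. The CSV contents, the fetch_snapshot_details helper, and the returned fields are all invented for illustration; the actual processor script is not shown here.

```python
import csv
import io

def fetch_snapshot_details(host, live_hosts):
    # Stand-in for the real per-instance query; raises on dead hosts.
    if host not in live_hosts:
        raise ConnectionError(f"no route to {host}")
    return {"host": host, "snapshot_count": 24, "snapshots_enabled": True}

def collect(csv_text, live_hosts):
    # Walk the CSV of hosts, splitting results into successes and failures.
    results, failures = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        host = row[0].strip()
        try:
            results.append(fetch_snapshot_details(host, live_hosts))
        except ConnectionError:
            failures.append(host)
    return results, failures

# Five sample hosts: two live, three non-existent, as in the POC.
hosts_csv = "10.0.0.5\n10.0.0.6\n10.0.9.1\n10.0.9.2\n10.0.9.3\n"
ok, bad = collect(hosts_csv, live_hosts={"10.0.0.5", "10.0.0.6"})
print(len(ok), len(bad))  # prints: 2 3
```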

Splitting the Records 

Next, the records need to be split on a per-instance/VM basis into successful connections and failures. This is done via a processor we called simply “SplitRecord”. This processor took the data from these instances, existent and non-existent, and created records with the same file name, but each with a different host (UUID) record.

Looking into one of the flowfiles and its attributes, we can see many different attributes that can be called upon in later steps as necessary. For our purposes, the attribute we are most interested in is the fragment count. The fragment count attribute tells Fuusion that there are five fragments to this single record. Knowing this, along with the UUIDs, allows Fuusion to determine which fragments belong to the record and to re-assemble them upon request.

Execute String 

This step verifies the output of each SoftNAS connection and the data requested from each by the Python script in the processor mentioned earlier. As you can see, this record has successfully pulled the hostname, platform, volume name, snapshot count, the age of the last snapshot (LAST_AGE), whether snapshots are enabled, on what schedule, and how long they are retained, for each volume on the instance. The volumes are listed below, with values provided for each of the variables above.

If the connection fails, on the other hand, this data is not available. So, for the three non-existent IP addresses (purposely included to show how failed connections are handled, as you will recall), we instead see execution_status_0 in the flowfile attributes, indicating a failed connection. This attribute allows us to sort the failures separately, as we will see shortly.

Update Attribute Step

The Update Attribute processor’s job is simply to rename the files based on success or failure. Successful connections with the data pulled from valid servers are re-named AWS_SNAPSHOT_RESULTS.

The failed nodes (based on the execution_status_0 attribute mentioned earlier) are renamed AWS_ERROR_HOSTS. Remember, even though the same filename is applied to each file in each category, the files are still differentiated by separate UUIDs and can still be recompiled based on the fragment count attribute.

“Notify” and “Wait”

In typical configurations, Fuusion flows are not configured to perform batch operations. But as you will see, with some creativity and ingenuity, Fuusion is flexible enough to manage just about anything. To meet this requirement, we needed to leverage some pre-existing controller services in a creative fashion, notably a distributed map cache, similar to services such as DynamoDB or Redis, or anything with a key-value store. To put it simply, the key is something we need to count, and the value is that count. The Notify processor tells us about that count (the count being the successes or failures to be sorted). The signal identifier is a made-up value simply called ‘release’. The signal counter is a key called ‘process_record’, and each process_record increases the count by an increment of 1. Each increment is then stored in the Distributed Map Cache.

The coolest part is that there was no need to set up a separate service such as DynamoDB or Redis. We were able to leverage the rich variety of controller services already present to create our own solution.

With the “Wait” processor, we are essentially telling the flow when to proceed further, i.e., when to run the batch process, by listening for the “release” signal identifier. The Wait processor finds the fragment_count attribute mentioned earlier and holds records until the Notify counter reaches the value specified by fragment_count, which we know to be five. Five fragments come into Notify, and five go out, split based on defined variables, all automated in the flow to this point.
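The counting pattern above can be sketched with an ordinary dict standing in for the Distributed Map Cache. The key names mirror the flow described here (“release”, “process_record”), but the mechanics are a simplification for illustration, not NiFi's actual controller service.

```python
cache = {}  # toy stand-in for the Distributed Map Cache

def notify(cache, signal_id="release", counter="process_record"):
    # Each processed fragment increments the signal counter by one.
    counts = cache.setdefault(signal_id, {})
    counts[counter] = counts.get(counter, 0) + 1

def wait(cache, fragment_count, signal_id="release", counter="process_record"):
    # Release the batch only once every fragment has been counted.
    counts = cache.get(signal_id, {})
    return counts.get(counter, 0) >= fragment_count

for _ in range(5):       # five fragments flow through Notify
    notify(cache)

print(wait(cache, fragment_count=5))  # prints True: the batch may proceed
```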

Route on Attribute 

So, with Notify and Wait, we tell the flow when to proceed. We’ve split the fragments apart and have attributes labelling them as success or failure, and now we are ready to begin putting them back together. The first thing we need to do is merge successes together and merge failures together. We do this by sending the fragments in different directions within the flow, using the Route on Attribute processor. This is done quite simply by routing the fragments based on a value we’ve seen before, the execution_status. With a simple NiFi Expression Language command, the files are sorted and split between two basically identical processors called MergeRecord.

Fragments (flowfiles) with an execution_status of 1 are sent to the right (successes). As you can see, two files have been sent to the MergeRecord processor, and as we know, two of the five IP addresses corresponded to live SoftNAS virtual machines.

Those with an execution_status of 0 (failed connections) are sent to the left. As you can see, three fragments are sent to the MergeRecord processor, corresponding to the three invalid IP addresses.
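As a toy illustration of this routing step in Python (the attribute names follow the flow above; the flowfiles themselves are made up):

```python
# Five flowfile fragments carrying the execution_status attribute,
# matching the two live and three invalid hosts in the POC.
flowfiles = [
    {"uuid": "a1", "execution_status": "1"},
    {"uuid": "a2", "execution_status": "1"},
    {"uuid": "b1", "execution_status": "0"},
    {"uuid": "b2", "execution_status": "0"},
    {"uuid": "b3", "execution_status": "0"},
]

# Route on Attribute: successes go one way, failures the other.
successes = [f for f in flowfiles if f["execution_status"] == "1"]
failures = [f for f in flowfiles if f["execution_status"] == "0"]

print(len(successes), len(failures))  # prints: 2 3
```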

Each of these MergeRecord processors compiles the fragments together: successes merged in one, failures in the other. Finally, the PutFile processor creates a single file out of the merged records, preparing it to be sent to a shared storage location, in this case an S3 repository.

So, with this flow, we were able to pull raw data from a given source (in this case SoftNAS), assign attributes to split and separate the desired data, direct where it goes, and then put it back together in the desired format, all before it hits a central repository.

Once sent out to S3, the file can be retrieved by another flow to apply additional formatting if necessary, such as converting it into a CSV or another file format ready for consumption, like the one below.

This CSV output, fully automated and tailored to the needs of the client in a standardized format (or a considerably larger one containing all 200 of Halliburton’s servers and their current snapshot status), can then be input directly into any business intelligence tool you specify.

Remember that while this is a very simple example, the same principles can apply to data from any source, which can be split and recompiled in the same manner based on any attribute defined. That’s powerful stuff when defining a data flow. Any number of use cases can benefit from just the basic principles illustrated here.

More Information

Get a Fuusion Demo to find out how we can automate your biggest dataflow challenges, or even just take the hassle out of some of your smaller ones. 

Fuusion Use Case: Real-Time Local Edge Processing 

As the owner of a mid-size manufacturing company, you are challenged daily with how to cohesively manage your data across all your production sites. You want to analyze data at each site to improve productivity, but you also need to track long-term trends to strategically guide your business into the future.

You have dozens of sites spread through North America from one coast to the other and everywhere in-between. You’re doing brisk business, but you know it could be better. You just need to tap into the data your sites are generating to get to the next level. 

That data could help you to improve sales, figure out where supply chain issues affect production, and to pinpoint inefficiencies at production site levels. Having easier access to real-time (or near real-time) data and historical batch analytics will help to find and address the gaps and issues that are holding you back. 

Generating the data isn’t the problem. You already have terabytes of data being created daily, but it’s just sitting idle for far too long. You need to aggregate and store the data so you can easily get to it when you need it. Additionally, you need to process the data into a format that you can easily analyze with the company software you already have and with cloud services. 

Yes, processes are in place for individual sites, and some analysis has already been done on a one-off basis. But you need an automated process that will allow you to centrally manage all your data from a single machine data point up to the entire company production view. Also, data formats cobbled together for only specific locations don’t help out the bigger picture and slow everything down. You need source data that has not been manipulated so you can provide an even, unbiased account of your business. 

You need consistency across every single site. And you need to process that data automatically in real time. Aggregating it quarterly, like it’s done now, isn’t doing you any good. It seems impossible, or at the least, at an exorbitant cost. You want to analyze data right now across all your sites, without headaches, to improve day to day operations and increase productivity. 

It’s time to start laying the correct groundwork. Define the datasets you need. Nail down the file formats that your company software can use. Start tracking long term trends to strategically guide future business decisions.  This is where Fuusion comes in. 

Fuusion can help you achieve your goals of managing your data across multiple sites.  Plus, it can help you route your data to cloud services so you can run long-term analytics to understand your business trends to ensure future success. 

So how does Fuusion work? Fuusion connects to and ingests data from all your machinery and ERP systems at each site, no matter where it is located. Not only will it connect and ingest the data, it can perform pre-defined operations to format and prepare the data for delivery, as well as define where the data will go. Next, Fuusion can process your data locally, in pre-defined formats, and boost the speed at which the data arrives centrally. Your data is then processed in the desired formats and integrated with popular AI/ML frameworks for low-latency inferencing. Lastly, the processed data is routed to the cloud, where it can be aggregated with cloud services so you can compare it to historical data and observe trends from long-term analysis. With the right plan of attack and the right tools, even the impossible becomes possible!

Islands in the Stream

Data is everywhere, flowing through our networks like the rivers and streams in our physical world. And like the physical world, it accumulates wherever the conditions allow. With water, this results first in ponds, then lakes, and eventually oceans. In the data world, the same can happen: vast reservoirs of data that could easily become as hard to plumb as the ocean itself. These pools of data can occur anywhere, but particularly in locations where the means to gather it are not naturally present, or where natural barriers prevent its flow. Information technology terms this the edge; the information that resides outside the traditional datacenter, in these difficult-to-access reservoirs, is edge data.

Buurst’s goal, whether with our storage management solution, SoftNAS, or our new edge data consolidation and orchestration product, Fuusion, is to bridge the divide between useful, usable data and the potentially inaccessible oceans of data your organization might currently generate.

Because of the difficulties in transporting multiple large and small streams of data and orchestrating these flows in real time, a more agile infrastructure ideal is being proposed, one where processing occurs increasingly at these remote locations. Gartner tells us that the infrastructure of the future will be “anywhere the business needs it”, and that by 2022, 60% of enterprise IT infrastructures will focus on centers of data, rather than traditional data centers.

Increasingly, we’re seeing the advantages of processing at the edge locally. You can reduce your cloud cost and cloud spend quite a bit, and give yourself much faster turnaround times at the edge. Or in some cases, it may be the only thing feasible because you need maybe millisecond or even sub-millisecond responsiveness at the edge.

Rick Braddy, CTO, Buurst

This does not mean that centralized data management will go away entirely. As Adam Burden, Accenture’s technology lead for North America, puts it: “The biggest issue of creating ‘centres of data’ is the underlying architecture and technology ‘ballet’ needed to ensure there is a consistent version of the truth – no matter the data lake, the system or data stream being interrogated.” This means that regardless of how many data-gathering loci you have, you must ensure the data remains consistent, without duplication or differing results due to different gathering methods or reporting criteria. In an article by ITPro, duplication of effort is identified as one of the primary pitfalls facing the industry: “Many organisations are paying for resources and tools across multiple centres – HR might be building processes and storing data one place, while legal and finance are each doing the same elsewhere.”

For this reason, a solution that is centrally managed, yet can easily extend across multiple locations and scale accordingly, is ideal. Essentially, we want to create islands in the data stream. On these islands, depending on size and scope, we can create the foundations of bridges that allow unfettered access to verifiable source data, or create dams to control or redirect the flow, as well as parse and filter out the desired data. This actionable data could be considered analogous to hydro-electric energy, empowering the datacenter.

Buurst’s Fuusion consists of two parts. The Fuusion Controller orchestrates the dataflows from the edge and handles centralized management. At the edge, often in containers such as Kubernetes or in small virtual machines hosted on-site, you will find Fuusion Edge, gathering, parsing, and delivering pre-defined data flows to where they are needed. To ensure that the data gathered is accurate and unchanged, Fuusion provides clearly defined tracking and provenance information at every step of its journey, at every processor it touches.

Connect Off-Cloud Data to On-Cloud Services

One of the key challenges in gathering data of this nature, data generated at the edge, is the network: connectivity issues, increased latency, schedule-reliant connectivity (such as satellite uplinks), and other difficulties. Whether it’s an oil rig at the far reaches of the prairies or a ship reporting locational data only when satellite connectivity is available, the organization in question must make the most of the windows of connectivity available. Fuusion handles this with its patented UltraFast feature, leveraging the full available bandwidth by pushing the flow of data across UDP instead of latency-inhibited TCP.

Another key challenge to overcome is the numerous formats that data generated at the edge can take: Word documents, Excel spreadsheets, SQL and NoSQL data, JSON, Salesforce, or any combination of these and more. Any solution must be flexible enough to handle the data generated and either parse it into a usable format (process it at the edge) or transfer it to a location where it can be processed in a clean, unchanged form. On the processing side, Fuusion’s Apache NiFi-based processors natively handle multiple common file formats out of the box, leveraging NiFi’s longstanding open-source efforts. In addition to these pre-configured processors, Fuusion offers custom processor capabilities, allowing our professional services to create a solution where there was none before.
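The “many formats in, one shape out” idea can be illustrated with a tiny dispatcher. This is only a pattern sketch: Fuusion's NiFi-based processors handle far more formats, and the function and field names here are invented.

```python
import csv
import io
import json

def parse_records(payload, fmt):
    # Normalize different source formats into a common list-of-dicts shape.
    if fmt == "json":
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt}")

# One JSON record and one CSV record, normalized side by side.
rows = parse_records('{"site": "rig-7", "temp": 81}', "json")
rows += parse_records("site,temp\nrig-9,78\n", "csv")
print([r["site"] for r in rows])  # prints: ['rig-7', 'rig-9']
```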

For those dataflows where the data needs to be kept intact for future processing, Fuusion offers clear provenance, tracking the flow of data from the beginning of its journey to the end, no matter how many stops along the way. Fuusion also ensures that if a flow is stopped, it will resume the moment connectivity is re-established, right where it left off, by comparing data to the last processor touched. Each processor can rely on the previous.

Finally, we have another key problem at the edge – support infrastructure. In the remote locations where edge data is generated, access to server infrastructure is limited, if it exists at all. Fuusion flexibility is not just about the data formats we can handle, it is also about deployment and scalability. As Rick Braddy told us in a recent webinar, “We can deploy on physical machinery, VMs or containers. We can live within a Kubernetes cluster if you are already running Kubernetes out at the edge. Or even on different cottage type of devices, like a Snowball edge, or Azure Stack. And then also we can of course run on hyper-converged, which is still just a virtualized infrastructure, and all of this (can be) centrally managed”.

Buurst’s Fuusion is uniquely equipped to handle the rivers and streams of data meandering across your landscape. Rather than let them become oceans to sift through, we can help ensure that islands of actionable, real-time data solutions are within your organization’s reach. Contact Buurst Professional Services to learn how.

Cloud to Edge Data Distribution Over High-Latency and Satellite Networks

Background:

Over the past 8 years, we have seen more than an exabyte of data migrated onto the leading public cloud platforms. This makes public cloud a new center of the business data universe for many enterprises and government agencies today.

Now we see the rapid rise of Edge Computing, where up to 50% of all new data creation will take place at the edge over the next several years. Indeed, we see traditional on-premises storage, server, and network vendors turning their attention to becoming the arms providers fueling the growth of the edge and its IoT cousins, to restore growth and health to their IT infrastructure businesses, which have suffered as cloud migrations supplant traditional datacenters.

As we introduced the Buurst Fuusion™ product into the market in 2020, we at first focused on what came naturally to us after helping customers migrate thousands of applications and petabytes of data from on-prem into the leading clouds: moving data into the cloud. What we discovered came as a bit of a surprise. Of course, there are still massive data transfers and migrations to the cloud, but now we see the rise of the edge creating a gravitational pull on the data stored centrally in the clouds to fuel edge computing.

Edge Is Data Hungry

We have heard about edge data creation intensity, fueled largely by IoT sensor data feeding many edge computing systems. What we don’t hear as much about is the fact that these often standalone, headless edge nodes require care and feeding from a central command and control system, which is increasingly hosted in the cloud.

Edge systems require data for:

    • Software updates
    • Configuration settings
    • Inference engine updates
    • Code deployments and updates
    • Container deployments

Edge computing systems need to be installed, configured, updated and cared for like any other IT system. What are some examples of edge systems that need centralized control?

    • Offshore Rigs
    • Shipping Vessels
    • Military Systems
    • Electric Vehicles
    • Commercial Drones

And since many of these edge systems exist at remote locations, the networks connecting edge and cloud are often less than ideal. In fact, many remote edge systems must rely upon satellite, radio, DSL, and in the future, 5G networks for their connection with the rest of the world. These edge networks often bring high levels of latency, packet loss and sometimes intermittent connectivity, unlike the pristine, redundant network conditions we see in the cloud and traditional data centers.

To successfully deploy and maintain edge computing systems remotely over challenging high-latency, lossy network conditions between systems using incompatible data types, several solutions are needed:

    1. Store and forward with guaranteed delivery
    2. Optimizations to overcome TCP/IP issues with high-latency and packet loss
    3. Data format flexibility across disparate systems.

Let’s examine each of these data distribution challenges in more detail.

Store and Forward with Guaranteed Delivery

Particularly with satellite networks using mobile access points (e.g., ships at sea, military personnel and equipment), connectivity is intermittent and can be lost entirely for unpredictable periods of time. During network outages, it’s critical that edge systems can queue up data that needs to be transmitted so it isn’t lost. It’s equally important in many cases to ensure guaranteed delivery to maintain transactional integrity.

Optimizations to overcome TCP/IP issues with high-latency and packet loss

It’s common knowledge that TCP/IP throughput collapses in the face of network latency and packet loss. Latency is commonly introduced by satellite and other radio (e.g., 4G/5G) communications systems. Packet loss exacerbates the effects of latency on TCP/IP by causing TCP’s window size to shrink. The results can be readily seen in the following chart.

It’s easy to see how TCP’s throughput becomes unusable with any appreciable amount of packet loss and latency, limiting the effectiveness of cloud-to-edge data distribution and edge-to-cloud data consolidation efforts in many real-world use cases.
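The shape of that chart follows from the well-known Mathis et al. approximation for steady-state TCP throughput, roughly MSS/RTT multiplied by C/sqrt(p). The sketch below applies that formula with illustrative numbers; the exact constant depends on the TCP variant in use.

```python
import math

def tcp_throughput_mbps(mss_bytes, rtt_s, loss_rate, c=1.22):
    # Mathis et al. approximation: throughput <= (MSS / RTT) * (C / sqrt(p)).
    bytes_per_s = (mss_bytes / rtt_s) * (c / math.sqrt(loss_rate))
    return bytes_per_s * 8 / 1e6

# 1460-byte segments, 250 ms RTT (a satellite-like path), 1% packet loss:
print(round(tcp_throughput_mbps(1460, 0.250, 0.01), 1))  # prints 0.6
```

Even on a multi-gigabit link, 250 ms of latency plus 1% loss caps a single standard TCP flow below 1 Mbps, which is why cloud-to-edge transfers over such paths need a different transport approach.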

Data Formats – Object Storage, File Storage and SQL/NoSQL from the Clouds

Data is stored in the cloud in many different formats. Object storage is the most common, such as S3 on AWS® and Azure Blobs on Microsoft® Azure. Unstructured data continues to be managed as filesystems, and increasingly by NoSQL databases. Structured data can be found in various SQL databases, as usual.

Different edge devices operate upon and create data in their own proprietary formats, often accessed via REST APIs.

Some means of extracting data, combining fields of related data, and then transforming the data format at the proper place in the edge/cloud continuum is required.

How Fuusion Addresses Data Distribution between Cloud and Edge

Guaranteed Delivery. Fuusion leverages Apache NiFi technology to manage data flows. Data gets queued between each data flow processing block, as well as when sent across the network, so delivery is guaranteed even at very high scale.

UltraFast® Data Acceleration. Fuusion includes a key feature that optimizes end-to-end data transfer over high-latency, lossy networks. It does this by intercepting TCP traffic via a SOCKS proxy and redirecting it through a proprietary UDP channel that ensures reliable delivery and overcomes the effects of latency and packet loss using patented technology. To see the contrast in data throughput in the face of latency and packet loss, consider this performance chart of UltraFast throughput.

As we can see from the above chart, throughput remains reasonably constant, at 90% or better, regardless of the network link’s latency and packet loss characteristics. UltraFast automatically detects latency and packet loss and constantly adjusts and optimizes to maximize throughput.
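The core idea of reliable delivery over UDP can be illustrated with a toy sequence-and-retransmit loop. To be clear, this is not Buurst's patented protocol, only the classic technique such channels build on: number every chunk and resend it until acknowledged, without shrinking a congestion window in response to loss the way TCP does.

```python
import random

def reliable_transfer(chunks, loss_rate=0.3, seed=42):
    """Toy reliable delivery over an unreliable (UDP-like) channel:
    each chunk is retransmitted until the receiver acknowledges it."""
    rng = random.Random(seed)
    delivered = {}
    sends = 0
    for seq, chunk in enumerate(chunks):
        while seq not in delivered:          # retransmit until acked
            sends += 1
            if rng.random() >= loss_rate:    # datagram survived the link
                delivered[seq] = chunk       # receiver acks this seq
    return [delivered[i] for i in range(len(chunks))], sends

data = [f"chunk-{i}".encode() for i in range(100)]
received, sends = reliable_transfer(data, loss_rate=0.3)
assert received == data                      # every chunk arrives, in order
print(f"{sends} sends for {len(data)} chunks at 30% loss")
```

Because the sender keeps the link full and only pays for the lost datagrams themselves, throughput degrades roughly in proportion to the loss rate rather than collapsing as TCP's window does.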

How Fuusion Addresses Data Format Flexibility Challenges

Fuusion supports dozens of common data formats, including files in various formats (e.g., XML, JSON), SQL, NoSQL and many cloud services via either specific “connectors” or custom REST API integrations. Instead of involving DevOps for coding, Fuusion uses a powerful drag-and-drop visual data flow configuration approach, based on Apache NiFi, as a means of quickly configuring data flows and data format flexibility.

Take Action

Schedule a Demo with our Fuusion Team to learn how Fuusion can prepare your organization for the coming edge and data transformation.

Accelerating Data Transmission on High Latency Networks with Ultrafast

Frustrated by slow file transfers over FTP and TCP? Can’t afford IBM Aspera for bulk transfers? Need to transfer GB to TB size files over dirty, high latency networks with packet loss and can’t tolerate the painfully slow TCP transfer rates? Try UltraFast.

Our patented UltraFast technology rapidly transfers data to and from the cloud over WAN networks where packet loss and disruption are common. This technology is a key component of our newly available product, Buurst Fuusion, an edge solution to connect business data from remote locations to powerful data processing cloud services regardless of your data’s original format.

In this post I review the latest testing conducted by Buurst with UltraFast over high latency networks.

UltraFast Research Highlights

This technology was designed to offer superior data transfer capabilities across low-quality networks such as WANs, satellite links and even local networks. With as little as 1% packet loss, UltraFast delivered from 92% (US – Local) to 997% (Satellite) faster data transfer, depending on the type of network and distance. The overall average data transmission increase across all regions at 1% packet loss was 644.14%; at 2% packet loss, the average increase was 1,126.86%. These data points compare UltraFast to a basic TCP/IP transmission system.

These differences are significant: UltraFast typically delivered 10 GB of data in less than 2 minutes, while TCP/IP required hours in almost all cases. More details on the research are below.
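Percentage improvements like those above can be derived from the time each method takes to move the same payload. The numbers below are illustrative only, not Buurst's measured results:

```python
def speedup_pct(tcp_seconds, ultrafast_seconds):
    """Percent improvement in effective transfer rate, derived from
    the time each method takes to move the same payload."""
    return (tcp_seconds / ultrafast_seconds - 1) * 100

# Illustrative example: the same payload moved in 2 minutes with an
# accelerated channel vs. 2 hours over plain TCP on the same lossy link.
print(round(speedup_pct(2 * 3600, 2 * 60), 1))  # → 5900.0
```

So a "997% faster" figure corresponds to the accelerated transfer finishing in roughly one eleventh of the TCP time.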

Fuusion UltraFast™

A patented software-based virtual appliance that provides intelligent, self-tuning storage acceleration technology designed to address the real-world challenges of data transmission across dirty networks. Built on UDP, UltraFast substantially increases data transfer speeds compared to TCP/IP, offering users the flexibility to securely transfer data to/from the cloud without requiring networking equipment updates or limiting the data selected for transfer.

Latest UltraFast Testing Results

The Buurst engineering team recently conducted a series of TCP/IP vs UltraFast data transfer comparisons over a series of networks in various regions with different amounts of packet loss:

  • Satellite
  • US to Asia : Poor WAN
  • US to Europe : Typical WAN
  • US East to US West : Good WAN
  • US – Local

In each of these use cases, the packet loss rate was varied between 0% and 10%, with 8 different rates tested for each region. For example, here is the use case for US to Europe:

As you can see, there is a significant gain in transfer rates across networks with packet loss over 0.5% when using UltraFast. For example, on a network with 2% packet loss, UltraFast obtains a 1,275% better transfer rate.

Overall results for the regions demonstrated the significant improvement of transfer rates with higher packet loss and/or poor WAN connections. For complete results please email us.

Next Steps

I encourage you to learn more about our data, cloud, and edge technology and contact Buurst for a demo:

Overview Fuusion – Connecting critical business data at the edge to a centralized cloud service independent of data format and network capabilities

Request a Demo – See how UltraFast impacts overall performance in our new edge data technology