Fuusion Use Case: Real-Time Local Edge Processing

As the owner of a mid-size manufacturing company, you face a daily challenge: how to cohesively manage data across all your production sites. You want to analyze data at each site to improve productivity there, but you also need to track long-term trends to strategically guide your business into the future.

You have dozens of sites spread across North America, from coast to coast and everywhere in between. You’re doing brisk business, but you know it could be better. You just need to tap into the data your sites are generating to get to the next level.

That data could help you improve sales, identify where supply chain issues affect production, and pinpoint inefficiencies at individual production sites. Easier access to real-time (or near real-time) data and historical batch analytics will help you find and address the gaps and issues that are holding you back.

Generating the data isn’t the problem. You already have terabytes of data being created daily, but it’s just sitting idle for far too long. You need to aggregate and store the data so you can easily get to it when you need it. Additionally, you need to process the data into a format that you can easily analyze with the company software you already have and with cloud services. 

Yes, processes are in place at individual sites, and some analysis has been done on a one-off basis. But you need an automated process that lets you centrally manage all your data, from a single machine data point up to the entire company production view. Data formats cobbled together for specific locations don’t help the bigger picture; they slow everything down. You also need source data that has not been manipulated, so you can present an even, unbiased account of your business.

You need consistency across every single site, and you need to process that data automatically, in real time. Aggregating it quarterly, as is done now, isn’t doing you any good. It seems impossible, or at the least prohibitively expensive. You want to analyze data right now, across all your sites and without headaches, to improve day-to-day operations and increase productivity.

It’s time to start laying the correct groundwork. Define the datasets you need. Nail down the file formats your company software can use. Start tracking long-term trends to strategically guide future business decisions. This is where Fuusion comes in.

Fuusion can help you achieve your goal of managing data across multiple sites. It can also route your data to cloud services, where long-term analytics reveal the business trends that ensure future success.

So how does Fuusion work? Fuusion connects to and ingests data from all your machinery and ERP systems at each site, no matter where it is located. Beyond ingestion, it performs pre-defined operations to format and prepare the data for delivery, and it defines where that data will go. Fuusion then processes your data locally, in pre-defined formats, boosting the speed at which the data arrives centrally, and integrates with popular AI/ML frameworks for low-latency inferencing. Finally, the processed data is routed to the cloud, where it can be aggregated with cloud services so you can compare it against historical data and observe trends from long-term analysis. With the right plan of attack and the right tools, even the impossible becomes possible!
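To make that pipeline concrete, here is a minimal Python sketch of the ingest, transform, and route steps described above. Every name in it (the functions, the site identifier) is a hypothetical illustration; an actual Fuusion flow is assembled from pre-built visual processors, not hand-written code.

```python
# Minimal sketch of the ingest -> transform -> route pattern, for
# illustration only. Function names and the site identifier are
# hypothetical; real Fuusion flows are built from visual processors.
import json
from datetime import datetime, timezone

def ingest(machine_reading: dict) -> dict:
    """Tag a raw machine/ERP record with its site and capture time."""
    return {
        "site": "plant-17",  # assumed site identifier
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "payload": machine_reading,
    }

def transform(record: dict) -> bytes:
    """Normalize to the one format central analytics expects."""
    return json.dumps(record, sort_keys=True).encode("utf-8")

def route(normalized: bytes) -> None:
    """Deliver to the central endpoint (stubbed with a print here)."""
    print(f"shipping {len(normalized)} bytes to the aggregation endpoint")

route(transform(ingest({"spindle_rpm": 1200, "temp_c": 61.5})))
```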

Islands in the Stream

Data is everywhere, flowing through our networks like the rivers and streams of the physical world. And as in the physical world, it accumulates wherever conditions allow. With water, this produces first ponds, then lakes, and eventually oceans. The same happens in the data world: vast reservoirs of data that can become as hard to plumb as the ocean itself. These pools of data can occur anywhere, but particularly in locations where the means to gather data are not naturally present, or where natural barriers prevent its flow. Information technology calls these locations the edge, and the information that resides outside the traditional datacenter, in these difficult-to-access reservoirs, edge data.

Buurst’s goal, whether with our storage management solution, SoftNAS, or our new edge data consolidation and orchestration product, Fuusion, is to bridge the divide between useful, usable data and the potentially inaccessible oceans of data your organization currently generates.

Because of the difficulty of transporting multiple large and small streams of data and orchestrating these flows in real time, a more agile infrastructure ideal has been proposed, one where processing increasingly occurs at these remote locations. Gartner tells us that the infrastructure of the future will be “anywhere the business needs it”, and that by 2022, 60% of enterprise IT infrastructures will focus on centers of data rather than traditional data centers.

“Increasingly, we’re seeing the advantages of processing at the edge locally. You can reduce your cloud cost and cloud spend quite a bit, and give yourself much faster turnaround times at the edge. Or in some cases, it may be the only thing feasible because you need maybe millisecond or even sub-millisecond responsiveness at the edge.”

– Rick Braddy, CTO, Buurst

This does not mean that centralized data management will go away entirely – as Adam Burden, Accenture’s technology lead for North America, puts it: “The biggest issue of creating ‘centres of data’ is the underlying architecture and technology ‘ballet’ needed to ensure there is a consistent version of the truth – no matter the data lake, the system or data stream being interrogated.” In other words, regardless of how many data-gathering loci you maintain, you must ensure the data remains the same, without duplication or differing results caused by different gathering methods or reporting criteria. In an article by ITPro, duplicated effort is identified as one of the primary pitfalls facing the industry: “Many organisations are paying for resources and tools across multiple centres – HR might be building processes and storing data one place, while legal and finance are each doing the same elsewhere.”

For this reason, the ideal solution is one that is centrally managed yet can easily extend across multiple locations and scale accordingly. Essentially, we want to create islands in the data stream. On these islands, depending on size and scope, we can lay the foundations of bridges that allow unfettered access to verifiable source data, or build dams to control or redirect the flow and to parse and filter out the desired data. This actionable data is analogous to hydro-electric energy, empowering the datacenter.

Buurst’s Fuusion consists of two parts. The Fuusion Controller orchestrates dataflows from the edge and handles centralized management. At the edge, often in containers orchestrated by Kubernetes or in small virtual machines hosted on-site, Fuusion Edge gathers, parses, and delivers pre-defined data flows to where they are needed. To ensure that the gathered data is accurate and unchanged, Fuusion provides clearly defined tracking and provenance information at every step of its journey, at every processor it touches.
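For a sense of what per-processor provenance can look like, here is a small sketch of a hash-stamped event trail. The field names and structure are assumptions for illustration, loosely modeled on the event records NiFi-based systems keep; this is not Fuusion’s actual schema.

```python
# Illustrative provenance trail: each processor that touches the data
# records a content fingerprint, so any alteration is detectable.
# Field names are assumptions, not Fuusion's actual schema.
import hashlib
from dataclasses import dataclass, field

@dataclass
class ProvenanceEvent:
    processor: str       # which processor touched the data
    action: str          # e.g. RECEIVE, TRANSFORM, SEND
    content_sha256: str  # fingerprint proving the payload is unchanged

@dataclass
class FlowFile:
    payload: bytes
    trail: list = field(default_factory=list)

    def record(self, processor: str, action: str) -> None:
        digest = hashlib.sha256(self.payload).hexdigest()
        self.trail.append(ProvenanceEvent(processor, action, digest))

f = FlowFile(b'{"temp_c": 61.5}')
f.record("edge-ingest", "RECEIVE")
f.record("uplink-sender", "SEND")
# Matching digests at every hop show the data was never altered.
assert f.trail[0].content_sha256 == f.trail[-1].content_sha256
```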

Connect Off-Cloud Data to On-Cloud Services

One of the key challenges in gathering data of this nature – data generated at the edge – is network connectivity: outages, increased latency, schedule-reliant connectivity (such as satellite uplinks), and other network difficulties. Whether it’s an oil rig at the far reaches of the prairies or a ship reporting location data only when satellite connectivity is available, the organization must make the most of the connectivity windows available. Fuusion handles this with its patented UltraFast feature, leveraging the full available bandwidth by pushing the flow of data across UDP instead of the latency-inhibited TCP protocol.

Another key challenge is the numerous formats that data generated at the edge can take: Word documents, Excel spreadsheets, SQL and NoSQL data, JSON, Salesforce records – or any combination of these, and more. Any solution must be flexible enough to handle the data generated, and either parse it into a usable format (processing it at the edge) or transfer it in a clean, unchanged form to a location where it can be processed. On the processing side, Fuusion’s Apache NiFi-based processors natively handle multiple common file formats out of the box, leveraging NiFi’s longstanding open-source efforts. In addition to these pre-configured processors, Fuusion offers custom processor capabilities, allowing our professional services team to create a solution where there was none before.

For dataflows where the data must be kept intact for future processing, Fuusion offers clear provenance, tracking the flow of data from the beginning of its journey to the end, no matter how many stops along the way. Fuusion also ensures that if a flow is interrupted, it resumes the moment connectivity is re-established, right where it left off, by comparing data to the last processor touched. Each processor can rely on the previous one.
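A minimal sketch of that resume-where-you-left-off idea appears below, using a persisted byte offset as the checkpoint. The file names and chunking are hypothetical, and Fuusion is described as tracking progress per processor rather than per byte; this only illustrates the general pattern.

```python
# Sketch of resume-after-outage via a persisted checkpoint: the sender
# remembers the last acknowledged position and continues from there.
import json, os

STATE = "uplink.checkpoint"  # hypothetical checkpoint file

def load_offset() -> int:
    """Read the last acknowledged position, or start from zero."""
    if os.path.exists(STATE):
        with open(STATE) as fh:
            return json.load(fh)["offset"]
    return 0

def send_file(path: str, chunk: int = 4096) -> None:
    """Transmit from the checkpoint onward, persisting progress."""
    offset = load_offset()
    with open(path, "rb") as fh:
        fh.seek(offset)  # resume exactly where the outage cut us off
        while data := fh.read(chunk):
            # ...transmit `data`; after the receiver acknowledges it:
            offset += len(data)
            with open(STATE, "w") as out:
                json.dump({"offset": offset}, out)
```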

Finally, there is another key problem at the edge: support infrastructure. In the remote locations where edge data is generated, access to server infrastructure is limited, if it exists at all. Fuusion’s flexibility is not just about the data formats we can handle; it is also about deployment and scalability. As Rick Braddy told us in a recent webinar, “We can deploy on physical machinery, VMs or containers. We can live within a Kubernetes cluster if you are already running Kubernetes out at the edge. Or even on different cottage type of devices, like a Snowball edge, or Azure Stack. And then also we can of course run on hyper-converged, which is still just a virtualized infrastructure, and all of this (can be) centrally managed”.

Buurst’s Fuusion is uniquely equipped to handle the rivers and streams of data meandering across your landscape. Rather than let them become oceans to sift through, we can help ensure that islands of actionable, real-time data solutions are within your organization’s reach. Contact Buurst Professional Services to learn how.

Cloud to Edge Data Distribution Over High-Latency and Satellite Networks

Background

Over the past 8 years, we have seen more than an exabyte of data migrated onto the leading public cloud platforms. This makes public cloud a new center of the business data universe for many enterprises and government agencies today.

Now we see the rapid rise of edge computing coming next, with up to 50% of all new data creation taking place at the edge over the next several years. Indeed, traditional on-premises storage, server, and network vendors are turning their attention to becoming the arms providers fueling the growth of the edge and its IoT cousins, hoping to restore growth and health to IT infrastructure businesses that have suffered as cloud migrations supplant traditional datacenters.

When we introduced the Buurst Fuusion™ product to the market in 2020, we focused at first on what came naturally to us after helping customers migrate thousands of applications and petabytes of data from on-prem into the leading clouds: moving data into the cloud. What we discovered came as a bit of a surprise. There are, of course, still massive data transfers and migrations to the cloud, but now the rise of the edge is creating a gravitational pull on the data stored centrally in the clouds, drawing it out to fuel edge computing.

Edge Is Data Hungry

We have heard plenty about the intensity of edge data creation, fueled largely by the IoT sensor data feeding many edge computing systems. What we don’t hear as much about is that these often standalone, headless edge nodes require care and feeding from a central command-and-control system, one increasingly hosted in the cloud.

Edge systems require data for:

  • Software updates
  • Configuration settings
  • Inference engine updates
  • Code deployments and updates
  • Container deployments

Edge computing systems need to be installed, configured, updated and cared for like any other IT system. What are some examples of edge systems that need centralized control?

  • Offshore Rigs
  • Shipping Vessels
  • Military Systems
  • Electric Vehicles
  • Commercial Drones

And since many of these edge systems exist at remote locations, the networks connecting edge and cloud are often less than ideal. In fact, many remote edge systems must rely on satellite, radio, DSL, and, in the future, 5G networks for their connection to the rest of the world. These edge networks often bring high latency, packet loss, and sometimes intermittent connectivity, unlike the pristine, redundant network conditions we see in the cloud and traditional data centers.

To successfully deploy and maintain edge computing systems remotely, over challenging high-latency, lossy network conditions and between systems using incompatible data types, several capabilities are needed:

  1. Store and forward with guaranteed delivery
  2. Optimizations to overcome TCP/IP issues with high-latency and packet loss
  3. Data format flexibility across disparate systems

Let’s examine each of these data distribution challenges in more detail.

Store and Forward with Guaranteed Delivery

Particularly with satellite networks using mobile access points (e.g., ships at sea, military personnel and equipment), connectivity is intermittent and can be lost entirely for unpredictable periods of time. During network outages, it’s critical that edge systems can queue up data that needs to be transmitted so it isn’t lost. In many cases it’s equally important to ensure guaranteed delivery in order to maintain transactional integrity.
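Here is a bare-bones illustration of the store-and-forward pattern with delivery acknowledgement, using an on-disk spool directory. This is a generic sketch of the technique (the directory name and record format are invented), not Fuusion’s internal queueing.

```python
# Store-and-forward sketch: during an outage, records persist in a local
# on-disk queue; once the link returns, they drain oldest-first.
import os, time, uuid

QUEUE_DIR = "outbox"  # hypothetical local spool directory
os.makedirs(QUEUE_DIR, exist_ok=True)

def enqueue(record: bytes) -> None:
    """Persist first, so nothing is lost if the uplink is down."""
    name = f"{time.time_ns()}-{uuid.uuid4().hex}.msg"  # sortable by time
    with open(os.path.join(QUEUE_DIR, name), "wb") as fh:
        fh.write(record)

def drain(send) -> None:
    """On reconnect, deliver in order; delete only after success."""
    for name in sorted(os.listdir(QUEUE_DIR)):
        path = os.path.join(QUEUE_DIR, name)
        with open(path, "rb") as fh:
            send(fh.read())  # raises on failure -> file is kept
        os.remove(path)      # removed only once delivery succeeded

enqueue(b"sensor sample 42")
drain(lambda payload: print("delivered", payload))
```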

Optimizations to Overcome TCP/IP Issues with High Latency and Packet Loss

It’s common knowledge that TCP/IP throughput collapses in the face of network latency and packet loss. Latency is commonly introduced by satellite and other radio (e.g., 4G/5G) communications systems. Packet loss exacerbates the effects of latency on TCP/IP by causing TCP’s congestion window to shrink. The results can be readily seen in the following chart.

It’s easy to see how TCP’s throughput becomes unusable with any appreciable amount of packet loss and latency, limiting the effectiveness of cloud-to-edge data distribution and edge-to-cloud data consolidation efforts in many real-world use cases.
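One way to see why is the widely cited Mathis et al. (1997) approximation, which bounds steady-state TCP throughput by segment size, round-trip time, and loss rate. The link parameters below are illustrative stand-ins, not measurements from the chart.

```python
# Steady-state TCP throughput ceiling per the Mathis et al. (1997)
# approximation: rate <= (MSS / RTT) * (C / sqrt(loss)), with C ~= 1.22.
# The link parameters below are illustrative, not measured values.
from math import sqrt

def mathis_mbps(mss_bytes: int, rtt_ms: float, loss: float) -> float:
    """Upper bound on TCP throughput in Mbps for a given link."""
    rate_bps = (mss_bytes * 8 / (rtt_ms / 1000.0)) * (1.22 / sqrt(loss))
    return rate_bps / 1e6

# A satellite-like link: 600 ms RTT, 1% loss, 1460-byte segments.
print(f"{mathis_mbps(1460, 600, 0.01):.2f} Mbps")  # ~0.24 Mbps
```

At 600 ms of latency and 1% loss, the ceiling is a fraction of a megabit per second, no matter how much bandwidth the link nominally has.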

Data Formats – Object Storage, File Storage and SQL/NoSQL from the Clouds

Data is stored in the cloud in many different formats. Object storage is the most common, such as S3 on AWS® and Blob storage on Microsoft® Azure. Unstructured data continues to be managed in filesystems, and increasingly in NoSQL databases. Structured data lives in various SQL databases, as usual.

Different edge devices operate on and create data in their own proprietary formats, often accessed via REST APIs.

Some means of extracting data, combining related fields, and transforming the data format at the proper place in the edge/cloud continuum is required.
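As a toy illustration of that extract-combine-transform step, the snippet below joins a sensor reading with a related asset record and emits one flat JSON document. All names and values are invented for the example.

```python
# Toy extract-combine-transform step: join an edge reading with the
# related reference record, emit one flat JSON document.
import json

sensor = {"asset_id": "pump-9", "temp_c": 61.5}                 # edge reading
asset_db = {"pump-9": {"site": "plant-17", "model": "PX-200"}}  # reference data

combined = {**sensor, **asset_db[sensor["asset_id"]]}  # merge related fields
print(json.dumps(combined))
# {"asset_id": "pump-9", "temp_c": 61.5, "site": "plant-17", "model": "PX-200"}
```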

How Fuusion Addresses Data Distribution between Cloud and Edge

Guaranteed Delivery. Fuusion leverages Apache NiFi technology to manage data flows. Data is queued between each data flow processing block, as well as when it is sent across the network, so delivery is guaranteed even at very high scale.

UltraFast® Data Acceleration. Fuusion includes a key feature that optimizes end-to-end data transfer over high-latency, lossy networks. It does this by intercepting TCP traffic via a SOCKS proxy and redirecting it through a proprietary UDP channel that ensures reliable delivery and overcomes the effects of latency and packet loss using patented technology. To contrast data throughput in the face of latency and packet loss, consider this performance chart reflecting UltraFast throughput.
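Conceptually, the interception half of that design looks like pointing an application’s TCP connection at a local SOCKS endpoint, as in the sketch below. It uses the third-party PySocks library purely for illustration; the proxy address, port, and destination host are assumptions, and the reliable UDP transport behind the proxy is Fuusion’s proprietary piece, not shown here.

```python
# Conceptual only: the application dials a local SOCKS5 endpoint instead
# of the destination directly; whatever sits behind the proxy (here,
# hypothetically, a reliable-UDP channel) carries the bytes onward.
# Requires the third-party PySocks library: pip install PySocks
import socks

sock = socks.socksocket()  # drop-in replacement for a TCP socket
sock.set_proxy(socks.SOCKS5, "127.0.0.1", 1080)  # assumed local proxy
sock.connect(("data-hub.example.com", 443))  # app sees a normal TCP dial
sock.sendall(b"payload rides the accelerated channel")
sock.close()
```

The appeal of this design is that applications need no changes beyond a proxy setting: anything that can speak SOCKS gets the accelerated path for free.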

As the chart above shows, throughput remains reasonably constant, at 90% or better, regardless of the network link’s latency and packet loss characteristics. UltraFast automatically detects latency and packet loss and constantly adjusts and optimizes to maximize throughput.

How Fuusion Addresses Data Format Flexibility Challenges

Fuusion supports dozens of common data formats, including files in various formats (e.g., XML, JSON), SQL, NoSQL, and many cloud services, via either specific “connectors” or custom REST API integrations. Instead of involving DevOps for coding, Fuusion uses a powerful drag-and-drop visual data flow configuration approach, based on Apache NiFi, to quickly configure data flows and accommodate different formats.
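As a flavor of the format normalization such flows perform, here is a standard-library-only example that parses an XML reading and re-emits it as JSON. It stands in for what a pre-built processor would do; it is not Fuusion code.

```python
# Standard-library-only flavor of format normalization: XML in, JSON out.
import json
import xml.etree.ElementTree as ET

xml_reading = "<reading><site>plant-17</site><temp_c>61.5</temp_c></reading>"
root = ET.fromstring(xml_reading)
print(json.dumps({child.tag: child.text for child in root}))
# {"site": "plant-17", "temp_c": "61.5"}
```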

Take Action

Schedule a Demo with our Fuusion Team to learn how Fuusion can prepare your organization for the coming edge and data transformation.

Accelerating Data Transmission on High Latency Networks with UltraFast

Frustrated by slow file transfers over FTP and TCP? Can’t afford IBM Aspera for bulk transfers? Need to move GB- to TB-sized files over dirty, high-latency networks with packet loss, without tolerating painfully slow TCP transfer rates? Try UltraFast.

Our patented UltraFast technology rapidly transfers data to and from the cloud over WANs where packet loss and disruption are common. This technology is a key component of our newly available product, Buurst Fuusion, an edge solution that connects business data from remote locations to powerful cloud data processing services regardless of your data’s original format.

In this post, I review the latest UltraFast testing conducted by Buurst over high-latency networks.

UltraFast Research Highlights

This technology was designed to offer superior data transfer across low-quality networks such as WANs, satellite links, and even local networks. With as little as 1% packet loss, UltraFast delivered between 92% (US – Local) and 997% (Satellite) faster data transfer, depending on the type of network and distance. The overall average data transmission increase across all regions at 1% packet loss was 644.14%; at 2% packet loss, the average increase was 1,126.86%. These figures compare UltraFast to a basic TCP/IP transmission system.

These differences are significant: UltraFast typically delivered 10GB of data in less than 2 minutes, while TCP/IP required hours in almost all cases. More details on the research are below.
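A quick sanity check on that headline figure: moving 10GB in under 2 minutes implies a sustained rate of roughly two-thirds of a gigabit per second.

```python
# Sanity check on the headline numbers: 10 GB in under 2 minutes is a
# sustained rate of roughly 0.67 Gbps.
gigabytes, seconds = 10, 120
print(f"{gigabytes * 8 / seconds:.2f} Gbps sustained")  # 0.67 Gbps
```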

Fuusion UltraFast™

Fuusion UltraFast is a patented, software-based virtual appliance that provides intelligent, self-tuning storage acceleration technology designed to address the real-world challenges of data transmission across dirty networks. Built on UDP, UltraFast substantially increases data transfer speeds compared to TCP/IP, offering users the flexibility to securely transfer data to and from the cloud without requiring networking equipment updates or limiting the data selected for transfer.

Latest UltraFast Testing Results

The Buurst engineering team recently conducted a series of TCP/IP vs. UltraFast data transfer comparisons over networks in various regions, with different amounts of packet loss:

  • Satellite
  • US to Asia : Poor WAN
  • US to Europe : Typical WAN
  • US East to US West : Good WAN
  • US – Local

For each of these use cases, the packet loss rate was fixed at values ranging from 0% to 10%, with 8 different loss rates tested per region. For example, here are the results for the US to Europe use case:

[Chart: US to Europe – UltraFast vs. TCP/IP transfer rates by packet loss]

As you can see, there is a significant gain in transfer rates when using UltraFast across networks with packet loss over 0.5%. For example, on a network with 2% packet loss, UltraFast achieves a 1,275% better transfer rate.
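For readers converting percentages to multiples: an improvement of p percent corresponds to a factor of 1 + p/100 over the TCP baseline.

```python
# Converting a percent improvement into a throughput multiple:
# p percent better = (1 + p/100) times the TCP baseline rate.
pct_improvement = 1275  # the 2% packet loss figure quoted above
print(f"{1 + pct_improvement / 100:.2f}x the baseline TCP rate")  # 13.75x
```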

Overall, results across the regions demonstrated significant improvements in transfer rates as packet loss increased and/or WAN quality worsened. For complete results, please email us.

Next Steps

I encourage you to learn more about our data, cloud, and edge technology and contact Buurst for a demo:

Fuusion Overview – Connecting critical business data at the edge to centralized cloud services, independent of data format and network capabilities

Request a Demo – See how UltraFast impacts overall performance in our new edge data technology

Building an Edge Data Platform for Centralized AI/ML Processing

Buurst Fuusion is a decentralized solution with components running at the edge, in a centralized cloud (AWS, Azure), and as a data transfer accelerator in the network. Using a visual data flow tool built on Apache NiFi and data connector templates from the Fuusion Toolbox, customers can rapidly lay out a complete data flow that delivers information to an AI/ML solution for analysis and insight.

[Diagram: Fuusion architecture – edge components, centralized cloud, and UltraFast acceleration in the network]

To get a better understanding of this new product, Buurst engineering recently recorded a 30-minute episode with the L8istSh9y podcast community. The recording offers a behind-the-scenes look at Buurst Fuusion’s technology components: the open-source Apache NiFi platform, the challenges of edge data usage, and data transfer performance over a wide variety of networks.

A key component of Buurst Fuusion is our patented UltraFast technology, designed to overcome the significant network latency challenges that edge deployments will certainly face. This critical feature unlocks data for flow processing that would otherwise be considered too difficult to obtain.

For more information on this new product: