Every day, data scientists do complex things: moving on-premises data to the cloud, setting up an AI/ML service to process that data, and returning a result. Buurst Fuusion lets anyone be a data scientist by taking the complexity out of collecting data, transforming it, and getting results from it.
Let’s take a behind-the-scenes look at this new product coming soon from Buurst. Fuusion is a tool to fuse on- or off-cloud data with Cloud Data Services. (Fuusion overview blog posts) A Cloud Data Service can be something like Amazon Rekognition or Azure Data Lake.
Here’s How Fuusion Flows
Below is a flow to do video and image analysis using Fuusion. This flow collects a set of images, uploads the images to an S3 bucket, and triggers the AWS Rekognition service to run, then returns the results.
In Fuusion, data processing jobs are configured visually, as shown above. Buurst provides off-the-shelf flow templates for common use cases. Non-developers can readily customize these templates in minutes.
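For readers curious about what such a flow saves them from writing, here is a minimal Python sketch of the upload-and-analyze steps done by hand. It is not Fuusion's implementation. It assumes the boto3 S3 call upload_file and the Rekognition call detect_labels, covers still images only, and uses a hypothetical function name (analyze_images) and bucket name. The two clients are passed in as arguments so the logic can be tried without AWS credentials.

```python
def analyze_images(s3, rekognition, bucket, image_paths, max_labels=10):
    """Upload each image to S3, then ask Rekognition to label it.

    `s3` and `rekognition` are expected to be boto3 clients, for example
    boto3.client("s3") and boto3.client("rekognition"). They are injected
    rather than created here so the function can be tested with stand-ins.
    """
    results = {}
    for path in image_paths:
        # Use the bare file name as the S3 object key (an assumption).
        key = path.replace("\\", "/").split("/")[-1]
        s3.upload_file(path, bucket, key)
        response = rekognition.detect_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MaxLabels=max_labels,
        )
        # Keep only the label names; the response also carries confidences.
        results[key] = [label["Name"] for label in response["Labels"]]
    return results
```

With real clients, a call might look like analyze_images(boto3.client("s3"), boto3.client("rekognition"), "my-bucket", ["photo.jpg"]), where "my-bucket" is a placeholder. Even this small sketch leaves out credentials, retries, and delivery of the results, which is the work a prebuilt flow template is meant to absorb.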
A Deeper Dive into the Fuusion Flow
Now, let’s walk through the above flow in greater detail. The GetFile processor collects the configured Excel spreadsheet from a Windows file share called “Input.” The Excel control file is then converted into a JSON string and sent to our Rekognition processor in AWS. This processor manages the Rekognition job that runs on AWS and collects the job’s output. Once the job is completed, the PutFile processor returns the results to the Output folder on the Windows share. The data can be located anywhere: a filesystem, a SQL database, a proprietary device, and so on.
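The conversion step in the walkthrough above, from control file to JSON string, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Fuusion's actual processor: it assumes the spreadsheet has already been read into a header row and data rows, and the column names, share paths, and the "jobs" wrapper key are all hypothetical.

```python
import json

def control_rows_to_json(header, rows):
    """Turn spreadsheet-style rows into a JSON string, one object per job."""
    jobs = []
    for row in rows:
        # Skip rows where every cell is empty or missing.
        if all(cell is None or str(cell).strip() == "" for cell in row):
            continue
        jobs.append(dict(zip(header, row)))
    return json.dumps({"jobs": jobs}, indent=2)

# Hypothetical control-file contents: one configured job and one blank row.
header = ["Data Location", "AI Service", "Output Folder"]
rows = [
    [r"\\fileserver\Input\images", "Rekognition", r"\\fileserver\Output"],
    [None, None, None],
]

control_json = control_rows_to_json(header, rows)
print(control_json)
```

Each non-empty row becomes one JSON object keyed by the column headers, so adding a column to the control file adds an option without changing the conversion code.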
To make the data location, AI service, and other options easy to configure, these parameters are set in an Excel spreadsheet located in the user’s mapped drive or shared folder. Each user or department can create as many unique jobs as desired.
You just fill in your Excel control file on the file share. Fuusion then identifies any new data files that show up in the configured Data Location path and automatically starts the flow. Once the flow is complete, the output is delivered to your user-configured share folder.
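The idea of noticing new files in the Data Location path can be illustrated with a simple polling check. The sketch below is an assumption about one way to do this, not a description of how Fuusion detects files: it lists the folder, compares the file names against a set of names seen earlier, and returns only the newcomers. The function name find_new_files and the sample file names are hypothetical.

```python
import tempfile
from pathlib import Path

def find_new_files(data_location, seen):
    """Return names of files not seen on earlier checks, and remember them."""
    current = {p.name for p in Path(data_location).iterdir() if p.is_file()}
    new_files = sorted(current - seen)
    seen.update(new_files)
    return new_files

# Demonstration in a temporary folder standing in for the Data Location path.
with tempfile.TemporaryDirectory() as folder:
    seen = set()
    Path(folder, "frame1.jpg").write_bytes(b"")
    first_check = find_new_files(folder, seen)   # picks up frame1.jpg
    Path(folder, "frame2.jpg").write_bytes(b"")
    second_check = find_new_files(folder, seen)  # picks up only frame2.jpg
    print(first_check, second_check)
```

A scheduler would call find_new_files at intervals and start the flow whenever the returned list is non-empty.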
This type of flow is available in our Beta Fuusion release, along with flows that perform time series data forecasting and flows that enable the use of Azure ML Studio and Power BI by collecting data and converting it into a CDM schema inside an Azure data lake.