What is a dataset?

A dataset is a database that acts as an intermediary data source between Voleer automations and an application. It caches a subset of the application's information so that automations can run high-performance queries against the cache instead of against the application itself.

[Diagram: mceclip0.png]

In the above diagram, Zendesk is rate limited. In addition, you need to make multiple API calls to obtain enough objects to correlate with one another. With Voleer, all of the objects are retrieved into separate tables, so you can issue a single join query and obtain the correlated data in a single call.
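
The single join query described above can be sketched as follows. This is only an illustration: it assumes a SQL-style engine and uses an in-memory SQLite database as a stand-in, and the `users` and `tickets` tables, their columns, and the sample rows are all hypothetical rather than Voleer's actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for a dataset (assumption, not Voleer's engine).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, subject TEXT, requester_id INTEGER);
    INSERT INTO users   VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO tickets VALUES (10, 'VPN down', 1),
                               (11, 'New laptop', 2),
                               (12, 'Password reset', 1);
""")

def tickets_with_requesters(conn):
    """Correlate tickets with their requesters in one query instead of many API calls."""
    return conn.execute("""
        SELECT t.id, t.subject, u.name
        FROM tickets AS t
        JOIN users AS u ON u.id = t.requester_id
        ORDER BY t.id
    """).fetchall()

rows = tickets_with_requesters(conn)
for row in rows:
    print(row)
```

Against the source API, the same correlation would typically take one call to list tickets plus further calls to look up each requester, and every one of those calls would count against the rate limit.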

Datasets are useful in many different scenarios.

High Latency or Low Bandwidth

The network route between the source system and the location where the automation runs affects how quickly data can be retrieved. Latency is the time it takes data to travel between these two points.

Many factors affect latency, but generally the physical distance between the two datacenters determines how much latency there is.

In our Wi-Fi world, the internet still depends on undersea cables

The systems we access data from may also have low-bandwidth connectivity. In addition, some SaaS providers throttle the volume of data that can be retrieved within a period of time.

Voleer pulls data into its local datacenter, which has high connectivity, for local processing.

API Limitations or Throttling

Because there is no standard for how systems return data, many systems lack a performant way to retrieve a subset of data. Many require pulling down all of the data because they lack support for query-based APIs.

Furthermore, in order to prevent denial of service (DoS) attacks, systems implement API throttling. These throttling limits restrict the ability to pull large quantities of data from these systems.
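
To get a feel for how throttling limits bulk retrieval, the rough estimate below computes how long it takes to page through a full set of records when calls are spread evenly at the rate limit. The function name and every number in the example are hypothetical. They are not the limits of any particular system.

```python
import math

def estimate_sync(total_records, page_size, calls_per_minute):
    """Return (number of API calls, approximate seconds) to page through all records.

    Assumes one page per call and calls spaced evenly at the rate limit.
    """
    calls = math.ceil(total_records / page_size)
    seconds = calls * 60 / calls_per_minute
    return calls, seconds

# Hypothetical figures: 500,000 records, 100 records per page, 200 calls per minute.
calls, seconds = estimate_sync(500_000, 100, 200)
print(f"{calls} calls, about {seconds / 60:.0f} minutes")
```

With these assumed figures, a full pull needs 5,000 calls and takes about 25 minutes, and it would have to be repeated every time an automation needs fresh data unless the results are cached.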

Once the data is pulled into a Voleer dataset, we expose a native database query language that supports much richer queries than a typical API.

High Performance Queries

We are saving more and more data as storage costs continue to decrease. With such large volumes, processing the data directly against the source may not be feasible, because the source systems may not be able to retrieve and return datasets that large.

Because Voleer stores the data natively in a database, queries are performant and scale to large datasets without throttling. These databases are dedicated and isolated per connected system, per customer.

Historical Data

In systems that store a large amount of data, historical data is most likely archived and purged automatically to keep those systems responsive.

Datasets stay connected to the system and continue downloading new data while preserving old data within the dataset.
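
One way to picture this behavior is a sync that inserts new records and refreshes changed ones, but never deletes rows that have disappeared from the source. The sketch below is a simplified illustration using an in-memory SQLite database. The `tickets` table, the `sync` function, and the sample records are hypothetical and do not describe how Voleer implements its sync.

```python
import sqlite3

# In-memory SQLite database standing in for a dataset (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, subject TEXT, status TEXT)")

def sync(conn, source_records):
    """Insert new records and overwrite changed ones; local rows are never deleted."""
    conn.executemany(
        "INSERT OR REPLACE INTO tickets (id, subject, status) VALUES (?, ?, ?)",
        source_records,
    )
    conn.commit()

# First sync: the source still holds ticket 1.
sync(conn, [(1, "VPN down", "closed"), (2, "New laptop", "open")])

# Later sync: the source has purged ticket 1, updated ticket 2, and added ticket 3.
sync(conn, [(2, "New laptop", "closed"), (3, "Password reset", "open")])

history = conn.execute("SELECT id, status FROM tickets ORDER BY id").fetchall()
print(history)
```

Ticket 1 remains queryable in the local table even though the source no longer returns it, which is the property that makes historical reporting possible.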

Voleer offers scalable datasets which can handle large amounts of data in a performant and reliable way.

BI Integration

Most SaaS applications do not allow BI software to connect to them directly.

Voleer exposes raw database access, which most BI platforms are able to connect to.
