Amazon Timestream: Managed InfluxDB for Time Series Data
Amazon Timestream for InfluxDB goes into general availability today on Amazon Web Services (AWS).
The managed service for InfluxDB 2.x, a high cardinality, open source time-series database from InfluxData, considerably enhances AWS’ time series data capabilities and use cases.
Today also marks the elevation of Timestream, formerly the name of AWS’s existing time series database, to the name of the category for time series databases on the cloud services provider.
Similar to how Amazon RDS is the category for relational database services (including offerings like RDS for MariaDB, RDS for Oracle, and others), Timestream is the time series services category that contains Amazon Timestream for InfluxDB. Amazon Timestream for LiveAnalytics is the new name for the AWS database formerly called Timestream.
According to Brian Mullen, InfluxData CMO, the decision to position InfluxDB 2.x on AWS stemmed from a desire to deliver open source time series database functionality with the scalability, security, and reliability for which AWS is known.
“The modern way to consume open source is in the cloud,” Mullen observed. “Even if people are downloading open source Influx or Redis or some other thing, they’re typically deploying it on some instance they’re running in the cloud.”
Users can currently deploy InfluxDB 2.x as a managed service in AWS as a single instance, although customers can implement a service add-on (which Mullen said was akin to a hot standby multi-availability zone option) as a backup. The roadmap includes support for additional add-ons as well as a formal multiple availability zone option for the service.
At some point, InfluxData plans to make its InfluxDB 3.0 version available as part of Amazon Timestream.
In the meantime, users can avail themselves of the substantial speed and performance of InfluxDB 2.x on AWS. The engine also supports a strong developer experience and ease of use to ingest, analyze, store, and downsample time series data for a growing number of use cases — from the Internet of Things to network monitoring and observability.
Speed and Performance
Amazon Timestream for InfluxDB can be provisioned directly from the AWS Management Console. The high cardinality time series database platform enables organizations to simultaneously account for “hundreds of thousands to millions of time series,” Mullen mentioned.
Such cardinality makes it possible to correlate different time series for end-user applications at a tremendous scale — such as for equipment asset monitoring and predictive maintenance in the Industrial Internet. The rate at which InfluxDB is able to ingest time-series data is also notable and aided by an open source collection agent, Telegraf, which simplifies the ingestion process.
Included as part of the 2.x release of the database, Telegraf has “300 plus plug-ins ranging from standard stuff like MQTT all the way to really arcane IoT stuff that one or two people use that they contributed,” Mullen remarked. “Because of the open source nature of InfluxDB, you have this really strong ecosystem of technical folks [working on] the product.”
Mullen described an IoT use case for manufacturing in which the organization “was previously using Postgres. After switching to InfluxDB, their performance went from maybe 20 to 30 thousand points per second to millions of points per second. That’s an order of magnitude change in terms of the amount of throughput of data points landing per second.”
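To make the ingestion path described above more concrete, here is a minimal Python sketch of InfluxDB line protocol, the text format in which points are written to the database. The function name `to_line_protocol` and the sample measurement, tags, and fields are hypothetical and chosen only for illustration; the sketch covers the common escaping rules and is not a full implementation of the format.

```python
from typing import Mapping, Optional, Union

FieldValue = Union[float, int, bool, str]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character in `chars` that appears in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value: FieldValue) -> str:
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # String fields are double-quoted; escape backslashes and quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, FieldValue],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Serialize one point (hypothetical helper, illustration only)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    line = _escape(measurement, ", ")
    # Tags are emitted in sorted key order.
    for key in sorted(tags):
        line += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    field_part = ",".join(
        f"{_escape(k, ',= ')}={_format_field(v)}" for k, v in fields.items()
    )
    line += " " + field_part
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(
        to_line_protocol(
            "machine temp",
            {"site": "plant 1", "line": "a"},
            {"celsius": 21.5, "rpm": 1200, "ok": True},
            1700000000000000000,
        )
    )
```

Each distinct combination of measurement and tag set is its own series, which is why tag choice drives the cardinality discussed earlier. In practice an agent such as Telegraf produces and batches these lines, so hand-rolled serialization like this is mainly useful for understanding what is sent over the wire.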
Ease of Use, Developer Experience
The ease of use characterizing InfluxDB also pertains to its query experience. Although the 3.0 version includes native support for SQL and InfluxQL (InfluxData’s SQL-like query language), the 2.x version currently available in Timestream supports Flux, its native query language, as well as InfluxQL, which Mullen commented was “very similar to SQL.”
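As a rough illustration of how closely InfluxQL resembles SQL, the following Python sketch assembles a typical aggregate query as a string. The helper `mean_query` and the measurement, field, and tag names are hypothetical; the sketch only builds query text and does not contact a database.

```python
import re
from typing import Sequence

# InfluxQL duration literals such as 5m, 1h, 7d (subset accepted here).
_DURATION = re.compile(r"\d+(ns|u|ms|s|m|h|d|w)")


def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping embedded quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def mean_query(
    measurement: str,
    field: str,
    lookback: str,
    window: str,
    group_tags: Sequence[str] = (),
) -> str:
    """Build a windowed-mean InfluxQL query (hypothetical helper)."""
    for duration in (lookback, window):
        if not _DURATION.fullmatch(duration):
            raise ValueError(f"not a duration literal: {duration!r}")
    group_by = [f"time({window})"] + [quote_ident(t) for t in group_tags]
    return (
        f"SELECT MEAN({quote_ident(field)}) FROM {quote_ident(measurement)} "
        f"WHERE time >= now() - {lookback} "
        f"GROUP BY {', '.join(group_by)}"
    )


if __name__ == "__main__":
    print(mean_query("machine_temp", "celsius", "1h", "5m", ["site"]))
```

The SELECT, FROM, WHERE, and GROUP BY clauses will look familiar to SQL users; the main time series additions are the relative time filter (`now() - 1h`) and bucketing by interval with `time(5m)`.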
The open source database is also extensible, which is critical for facilitating data visualizations of what can be rapidly changing time series data. “Many people use us to pipe the data directly into something like Grafana,” Mullen revealed. “They can manage a Grafana offering within AWS as well, or people might be using their own visualization solution.” InfluxDB broadens the array of time-series use cases it enables by working on both high-fidelity and low-fidelity data.
The 2.x version supports inverted indexes and aggregate queries, which are important for data retention and downsampling. Downsampling reduces the number of data points in a dataset, which is useful for decreasing storage costs or optimizing storage for new incoming data.
With InfluxDB, “People have the ability to downsample as they need, versus by default having to store everything as it comes in,” Mullen explained. “They have that flexibility to essentially keep the short-term stuff, that high fidelity, and the long-term stuff at your discretion to downsample as you need.” Optimal time-series databases are able to support both downsampling and long-term data storage, which allows them to “handle a lot more use cases based on this flexibility for the customer to create their own timeline on the retention,” Mullen said.
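The idea of keeping recent data at full fidelity while downsampling older data can be sketched in a few lines of plain Python. This is an illustration of the concept only, not InfluxDB's own mechanism; the function names `downsample_mean` and `compact`, the window sizes, and the sample points are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value)


def downsample_mean(points: Iterable[Point], window_s: int) -> List[Point]:
    """Replace all points in each fixed window with their mean."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its window.
        buckets[ts - ts % window_s].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


def compact(
    points: Iterable[Point], now_s: int, raw_keep_s: int, window_s: int
) -> List[Point]:
    """Keep the last `raw_keep_s` seconds raw; downsample everything older."""
    pts = list(points)
    cutoff = now_s - raw_keep_s
    old = [p for p in pts if p[0] < cutoff]
    recent = sorted(p for p in pts if p[0] >= cutoff)
    return downsample_mean(old, window_s) + recent


if __name__ == "__main__":
    raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]
    print(downsample_mean(raw, 60))
    print(compact(raw, now_s=120, raw_keep_s=60, window_s=60))
```

Choosing `raw_keep_s` and `window_s` is the "timeline on the retention" that Mullen refers to: a longer raw window preserves detail for recent troubleshooting, while a coarser window for older data trades detail for lower storage cost.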
A Suitable Match
Developers using Amazon Timestream for InfluxDB get ready access to an open source time series database supporting high cardinality, ease of use, and performance gains that can account for a growing number of use cases. They’ll do so within the frequently sought-after confines of the hyperscale cloud provider and its myriad functions for securing, scaling, and building applications. The pairing of these benefits has the potential to democratize the underlying utility of time series analytics, further entrenching it into the data landscape.