Obtaining and Storing Time-Series Data with Python
Requirements
This tutorial was executed on a macOS system with Python 3 installed via Homebrew. I recommend setting up additional tooling like virtualenv, pyenv, or conda-env to simplify Python and Client installations. Otherwise, the full requirements are here:
txt
influxdb-client=1.30.0
pandas=1.4.3
requests>=2.27.1
- Created a bucket. You can think of a bucket as a database or the highest hierarchical level of data organization within InfluxDB.
- Created a token.
Request Weather Data
First, we need to request our data. We’ll use the request library to return hourly weather data from a specified longitude and latitude with the OpenWeatherMap API.
python
# Get time series data from OpenWeatherMap API
params = {'lat':openWeatherMap_lat, 'lon':openWeatherMap_lon, 'exclude': "minutely,daily", 'appid':openWeatherMap_token}
r = requests.get(openWeather_url, params = params).json()
hourly = r['hourly']
Convert the Data to a Pandas DataFrame
Next, convert the JSON data to a Pandas DataFrame. We’ll also convert the timestamp from a Unix timestamp with second precision to a datetime object. This conversion is being made because the InfluxDB write method requires that the timestamp is in datetime object format. We’ll use this method next to write our data to InfluxDB. We also drop columns that we don’t want to write to InfluxDB.
python
# Convert data to Pandas DataFrame and convert timestamp to datetime object
df = pd.json_normalize(hourly)
df = df.drop(columns=['weather', 'pop'])
df['dt'] = pd.to_datetime(df['dt'], unit='s')
print(df.head)
Write the Pandas DataFrame to InfluxDB
python
# Write data to InfluxDB
with InfluxDBClient(url=url, token=token, org=org) as client:
df = df
client.write_api(write_options=SYNCHRONOUS).write(bucket=bucket,record=df,
data_frame_measurement_name="weather",
data_frame_timestamp_column="dt")
Full Script
To review, let’s take a look at the full script all together. We followed the following steps:- Import libraries
- Gather the following:
- InfluxDB bucket
- InfluxDB org
- InfluxDB token
- InfluxDB URL
- OpenWeatherMap URL
- OpenWeatherMap token
- Build your request.
- Convert the JSON response to a Pandas DataFrame.
- Drop any columns you don’t want to write to InfluxDB.
- Convert the timestamp column from unix time to a Pandas datetime object.
- Instantiate the InfluxDB Python Client Library.
- Write the DataFrame and specify the measurement name and timestamp column.
python
import requests
import influxdb_client
import pandas as pd
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS
bucket = "OpenWeather"
org = "<my_InfluxDB_org>" # or email you used to create your Free Tier InfluxDB Cloud account
token = "<my_InfluxDB_token"
url = "<my_InfluxDB_url>" # for example, https://us-west-2-1.aws.cloud2.influxdata.com/
openWeatherMap_token = "<my_OpenWeatherMap_token>"
openWeatherMap_lat = "33.44"
openWeatherMap_lon = "-94.04"
openWeather_url = "https://api.openweathermap.org/data/2.5/onecall"
# Get time series data from OpenWeatherMap API
params = {'lat':openWeatherMap_lat, 'lon':openWeatherMap_lon, 'exclude': "minutely,daily", 'appid':openWeatherMap_token}
r = requests.get(openWeather_url, params = params).json()
hourly = r['hourly']
# Convert data to Pandas DataFrame and convert timestamp to datetime object
df = pd.json_normalize(hourly)
df = df.drop(columns=['weather', 'pop'])
df['dt'] = pd.to_datetime(df['dt'], unit='s')
print(df.head)
# Write data to InfluxDB
with InfluxDBClient(url=url, token=token, org=org) as client:
df = df
client.write_api(write_options=SYNCHRONOUS).write(bucket=bucket,record=df,
data_frame_measurement_name="weather",
data_frame_timestamp_column="dt")
Query the Data
Now that we’ve written our data to InfluxDB, we can use the InfluxDB UI to query for our data. Navigate to the Data Explorer (from the left-hand navigation bar). Use the Query Builder to select for the data that you want to visualize and the range that you want to visualize it for and then hit Submit.
A default materialized view of our weather data. InfluxDB automatically aggregates the time-series data so that new users don’t accidentally query for too much data and get a timeout.
Navigate to the Script Editor and uncomment or delete the aggregateWindow() function to view the raw weather data.