Zuar Runner Job: Subscriptions¶
Introduction¶
A subscriptions
table can exist on any host. Each row in the table defines a single
subscription to a report that will be created by performing a Tableau Export
. Reports
(also referred to as exports
) can be created in CSV, PDF, or XLSX formats. Among other
things, a subscription is defined by:
The name of a Tableau “view” that specifies the source of the data in the report
Optional JSON data consisting of key/value pairs that can be used to filter the data used to create the export
The frequency at which the subscription will be processed (daily, weekly, or monthly)
An optional day of the week on which to process the subscription
The time of day to process the subscription
A timezone associated with the time of day
One or more email addresses that will receive the report
Each time the Subscription Job
runs on a Zuar Runner instance, it records the time at which
the job was started. It then inspects each record in the subscriptions
table to
determine if the subscription should have run between the time of the last run and the
current time. If the subscription was defined to run during that interval, a Tableau
export preformed to create the report and the is emailed to the recipients.
Otherwise, the subscription is ignored.
Important
The Subscriptions Job processes all subscriptions defined in the subscriptions table – only one instance of a Subscriptions Job should be active on a Zuar Runner instance.
Tip
Because subscriptions can be created with an hourly frequency, the Subscriptions Job should usually be scheduled to run a minimum of once an hour. If it is desired to have report creation and emailing that closely matches the time specified in the subscriptions table, the Subscriptions Job should be scheduled to run once a minute.
Reports¶
A report may contain either a single view or all the views in a workbook. PDF reports can be extensively customized for scaling, sizing, etc. CSV and XLSX reports can not be customized.
PDF Reports¶
PDF reports are created using Tableau’s Vizql API. If a report contains multiple views, all views are assembled into a single PDF file.
CSV Reports¶
CSV reports are created using Tableau’s REST API. If a report contains a single view, the report that is created is a single CSV file. If a report contains multiple views, a ZIP archive is created. Archives contain one CSV file per view. File names within the archive are a combination of the workbook and view name with a CSV extension.
XLSX Reports¶
XLSX reports (Excel workbooks) are created using Tableau’s REST API. The workbook contains one worksheet per view. Sheet names within the workbook are a combination of the workbook and view name.
Job Configuration¶
Reference: Subscriptions job configuration
A Subscription Job’s configuration comprises three sections: subscriptions
, mail
,
and tableau
. The subscriptions
and mail
sections work the same, no matter the
type of report being exported. The tableau
section section contains Tableau server
credentials
as well as defaults that are used for export jobs. Values provided by
subscriptions always override default values that may be present in the tableau
section.
We’ll begin by discussing the sections. The discussion is not exhaustive; rather it is intended to provide an overall feel for how the Subscriptions job works, in general. Detailed descriptions of each configurable parameter can be found in the job’s schema documentation. The following Subscription job configuration will be used as an example.
{
"subscriptions": {
/* Defines the location of 'subscriptions' table */
"subscriptions_input": {
/* Optional section controls relative dates */
"relative_dates": {
"start_datetime": "global_start_datetime",
"end_datetime": "global_end_datetime",
"datetime_format": "%Y-%m-%d %H:%M:%S.%f"
},
"dbo": "postgresql://localhost/mitto",
"schema_name": "public",
"table_name": "subscriptions"
},
/* Defines the location of 'subscriptions_status' table */
"subscription_status_input": {
"dbo": "postgresql://localhost/mitto",
"schema_name": "public",
"table_name": "subscription_status"
}
},
/* Parameters for sending exports via email */
"mail": {
/* Defines the SMTP server to send mail with */
"server": "smtp.zuar.com",
"port": 25,
"require_tls": true,
/* SMTP server credentials (named credentials supported) */
"credentials": {
"username": "operations@zuar.com",
"password": "<password>"
},
/* Defines the email used to send exported reports */
"html": null,
"mail_bcc": null,
"mail_cc": null,
"mail_from": "operations@zuar.com",
"mail_subject": "default_mail_subject",
"text": "email body text",
"attachments": []
},
/* Defines Tableau server related parameters */
"tableau": {
/* Tableau server credentials (named credentials are supported) */
"credentials": {
"password": "<password>",
"username": "<username>"
},
/* Tableau server configuration */
"server": {
"server": "https://tableau.zuar.com",
"site": "",
"user_id_to_impersonate": "5cf3542a-0c46-42fb-b6a5-27a96e41054a"
},
/* default parameters to be used in creating CSV export s */
"export_csv": {
"name": "subscription_content",
"type": "csv"
},
/* default parameters to be used in creating XLSX export s */
"export_xlsx": {
"name": "subscription_content",
"type": "xlsx"
},
/* default parameters to be used in creating PDF export s */
"export_pdf": {
"log_responses": false,
"modifications": [
{
"action": "update",
"destination": "files",
"items": {
"clientDimension": "{\"w\": 1280, \"h\": 1024}",
"dashboardPortSize": "{\"w\": 1280, \"h\": 1024}",
"worksheetPortSize": "{\"w\": 1280, \"h\": 1024}"
},
"request": "bootstrap"
}
],
"name": "subscription_content",
"options": {
"imageHeight": "0",
"imageWidth": "0",
"pageFitHorizontal": "1",
"pageFitVertical": "1",
"pageOrientationOption": "landscape",
"pageScaleMode": "auto",
"pageScalePercent": "100",
"pageSizeOption": "letter"
},
"log_requests": false,
"save_responses": true,
"type": "pdf"
}
}
}
"subscriptions"
Section¶
"subscriptions_input"
Section¶
This defines the location of the subscriptions
table. The subscriptions
table must
have the following columns:
id
- A UUID that identifies the subscription.user_id
- A UUID belonging to a user on the Tableau server. When the export is run, the user associated with this UUID is impersonated.email
- One or more comma-separated email addresses to which the report will be emailed.view_name
- Theview name
to export.view_name
should be in the format<workbook_curl>/<view_curl>
, where<workbook_curl>
is the_content_url
of a workbook on the server and<view_curl>
is the_content_url
of a view in the workbook.schedule
- Defines how often the subscription is to run. Values are:DAILY
,WEEKLY
, andMONTHLY
.day_of
- Defines the day on which the subscription is to run.send_at
- Defines the time at which the subscription is to run (format:HH:MM:SS
).tz
- Is the timezone associated withsend_at
. This must be a timezone tz database format. E.g.:America/Chicago
.full
- Iftrue
, all sheets in the Tableau workbook are included in the export.<view_curl>
is ignored whenfull
istrue
.json_data
- One or more Tableau “view filters” to apply to the view when creating the export.Relative Dates
are supported injson_data
. See the following section.type
- The type of export to create:pdf
,csv
, andxlsx
.orientation
- The page orientation for PDF exports:landscape
andportrait
. The value is ignored for CSV or XLSX exports.
The Subscriptions job only reads data from this table; it never modifies it.
The following DDL will create a properly matched subscriptions
table:
create table subscriptions
(
id uuid not null
constraint subscriptions_pkey
primary key,
user_id uuid,
email varchar(256) not null,
view_name varchar(256) not null,
schedule varchar(8) not null,
day_of integer not null,
send_at time not null,
tz varchar(256) not null,
"full" boolean not null,
json_data text not null,
updated_at timestamp with time zone not null,
created_at timestamp with time zone not null,
type varchar(8) default 'pdf'::character varying,
orientation varchar(16) default 'landscape'::character varying
);
alter table subscriptions owner to mitto;
Relative Dates¶
subscriptions.json_data
may contain a date_range
key and value. The value of the key is a string containing
a relative date specification. The date_range
key and value are converted into view
filters specifying a start and end date. For example, if a subscription runs on
2020-01-12 and json_data
contains:
"date_range": "YESTERDAY"
the view filters created from json_data
will contain:
"global_start_datetime": "2010-01-11 00:00:00.000000",
"global_end_datetime": "2020-01-11 23:59:59.999999",
global_start_date
and global_end_date
are the default view filter keys used. These
defaults can be overridden via the relative_dates
section of the job configuration.
The following values can be used with date_range
:
|
---|
THIS_HOUR |
THIS_DAY |
THIS_WEEK |
THIS_MONTH |
THIS_QUARTER |
THIS_HALF |
THIS_YEAR |
NEXT_HOUR |
NEXT_n_HOUR |
NEXT_DAY |
NEXT_n_DAY |
NEXT_WEEK |
NEXT_n_WEEK |
NEXT_MONTH |
NEXT_n_MONTH |
NEXT_QUARTER |
NEXT_n_QUARTER |
NEXT_HALF |
NEXT_n_HALF |
NEXT_YEAR |
NEXT_n_YEAR |
LAST_HOUR |
LAST_n_HOUR |
LAST_DAY |
LAST_n_DAY |
LAST_WEEK |
LAST_n_WEEK |
LAST_MONTH |
LAST_n_MONTH |
LAST_QUARTER |
LAST_n_QUARTER |
LAST_HALF |
LAST_n_HALF |
LAST_YEAR |
LAST_n_YEAR |
TODAY |
TOMORROW |
YESTERDAY |
If a value contains _n_
, the n
represents any positive integer
number. For example: LAST_18_MONTH
, NEXT_7_DAY
, etc.
LAST_1_DAY
is equivalent to YESTERDAY
. LAST_0_DAY
,
NEXT_0_DAY
, and TODAY
are equivalent.
Modifying Relative Date Behavior¶
As described in the previous section global_start_datetime
and global_end_datetime
are the default keys and the default value format contains microseconds. These can be
controlled with the optional relative_dates
section of the job configuration.
For example, the following section:
"relative_dates": {
"start_datetime": "begin",
"end_datetime": "end",
"datetime_format": "%Y-%m-%d %H:%M:%S.%F"
},
the view filters created from json_data
will contain:
"begin": "2010-01-11 00:00:00.000",
"end": "2020-01-11 23:59:59.999",
The datetime_format
string is a Python strftime
format string that is used to format
a Python datetime.datetime
object. Note that the %F
format code in the above
example is not a standard Python strftime
code; it is a special code implemented by
Zuar Runner to support Tableau’s use of microseconds instead of milliseconds for some datetime
values. If the %F
code is present it must be the two final characters of the
datetime_format
string.
References:
"subscriptions_status_input"
Section¶
This defines the location of the subscriptions_status
table. The
subscriptions_status
is used to record the results of runs of the Subscriptions job.
"mail"
Section¶
The mail
sections controls how email is sent using SMTP and its content.
The following parameters in the mail
section, if present, will override the
value in the associated column of the subscriptions
table:
|
|
---|---|
mail_to |
tableau
Section¶
The tableau
section provides extensive control over the requests that are used
to create the exported reports. It is identical to and serves the same purpose as the
Export job configuration. Refer to the Export job
for more information.
Tableau Credentials¶
Tableau credentials can be embedded in the job configuration via:
"credentials": {
"password": "<password>",
"username": "<username>"
},
Alternatively, existing named credentials can be referenced:
"credentials": "zuar-tableau-creds",
The credentials used can be either those of a Tableau Server admin or those of a non-admin user.
If the credentials are those of a Tableau Server administrator:
User-id impersonation can be used, if desired
The user-id to impersonate can be provided either via the
user_id
column of thesubscriptions
table or viatableau.server.user_id_to_impersonate
in the job configuration.If the
user_id
column of thesubscriptions
table andtableau.server.user_id_to_impersonate
in the job configuration are both<null>
/None
, user-id impersonation is not performed.
If the credentials are not those of a Tableau Server administrator:
All values in the
user_id
column of thesubscriptions
table andtableau.server.user_id_to_impersonate
in the job configuration must be<null>
/None
.
Tableau Export¶
It is likely that most users will accept the default values for the export
session.
If present in the job configuration, the values provided will override the defaults.
The following parameters in the export
section, if present, will override
values in the associated column of the subscriptions
table:
|
|
---|---|
full |
full |
view_filters |
json_data |
workbook |
workbook_name |
view |
view_name |
type |
type |
If present in the export
section, workbook
and view
will be combined to create a
replacement for the view_name
value in the subscriptions
table.
Minimal Configuration¶
The initial job configuration shown earlier contains the builtin default values for many parameters for illustrative purposes. A smaller, simpler job configuration that uses defaults for most parameters follows:
{
"mail": {
"server": "smtp.zuar.com",
"port": 25,
"require_tls": true,
"credentials": "zuar-smtp-creds",
"mail_from": "operations@zuar.com",
"text": "email body text",
"attachments": []
},
"subscriptions": {
"subscriptions_input": {
"dbo": "postgresql://localhost/analytics",
"schema_name": "public",
"table_name": "subscriptions"
},
"subscription_status_input": {
"dbo": "postgresql://localhost/mitto",
"schema_name": "public",
"table_name": "subscription_status"
}
},
"tableau": {
"credentials": "zuar-tableau-creds",
"server": {
"server": "https://tableau.zuar.com",
"site": ""
}
}
}
Testing / Debugging¶
As an aid to testing and debugging, a Subscription Job can be run from the command-line in the following manner (
--current-time
is optional):python3 job_subscription.py \ --current-time 'YYYY-MM-DD HH:MM' \ /var/mitto/conf/example.json`
This runs the Subscription Job setting the current time to the time provided.
A
--noupdate
option is available. When present, it prevents thesubscription_status
table from being updated with the status of a run.
Warning
Use of –current-time updates the last checked time for all subscriptions. If this is used on a production system, the last_checked column in the subscription_status table should be reset to its previous value. This can be done manually or by re-running job_subscription.py –current-time option using the previous value.
The
mail.text
parameter supports expansion of the{debug}
template. This template will be replaced by the contents of the subscription that created the email and the export it contains as well as the actualexport
configuration used to create the export.
.