Once a file has been uploaded, you must create a feed to ingest its data into the tenant’s data lake.


Before you start

  • Make sure your file has been uploaded to Peak.
    To learn how to do this, see Ad Hoc Uploads.
  • Check that your files are named with a timestamp.
    If the file’s data will be fetched and updated regularly as part of a feed, each file must be named with a timestamp so that Peak loads them as part of the same feed.
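    For example, files named sales_2021-06-01.csv and sales_2021-06-02.csv (illustrative names) would be loaded as part of the same feed.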


Getting to the screens

To create a feed for a file:

  1. Go to Dock > Data Sources.
    The Feeds screen appears.
  2. Click ADD.
    The Choose Connector screen appears.
  3. Go to the File Storage section and click File Upload.
    The Create Feed screen appears.


Process Overview

There are three stages that need to be completed when creating a data feed for a file:

  • Import Configuration 
  • Destination
  • Trigger

To find out how to create new data feeds and edit existing ones, see Managing your data feeds.


Import Configuration

If you have already uploaded a file, either via the Peak interface or a signed URL, go to the File drop-down and select your file from the list.

If you haven’t already uploaded a file, click UPLOAD NEW.

For details of this process, see Ad Hoc Uploads.



After choosing your file, click NEXT and complete the fields. 

Once complete, click NEXT to move to the Destination stage.


Import Configuration Fields


File type

Choose the type of file: CSV, JSON, or XML.


CSV

Value separators can be:

  • Comma
  • Tab
  • Pipe


XML

  • Enter a value for the root tag 


Feed load type

Choose how the feed loads data into the destination. If you choose an upsert feed, you must also specify a primary key (see below).


Primary key (optional)

The primary key is only mandatory for an upsert feed.


Feed name

Enter a suitable name for the feed:

  • The name should be meaningful.
  • Only alphanumeric characters and underscores are allowed.
  • It must start with a letter.
  • It must not end with an underscore.
  • It can be up to 50 characters long.
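
For example, customer_orders_daily is a valid feed name, while 1st_feed (starts with a digit) and orders_ (ends with an underscore) are not.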



Destination

The Destination stage enables you to choose where your data will be stored.


Choose a destination

The destination is where the customer data is stored by Peak. 

It can be either S3 (Spark processing), Redshift, or both.

S3 (Spark processing)

This is Amazon S3 data storage. 

Peak uses Apache Spark to process large datasets, such as CSV files, stored on Amazon S3.
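
As a rough illustration of this kind of processing (not Peak’s internal implementation), a PySpark job can read a CSV dataset directly from S3 and aggregate it. The bucket, path, and column name below are hypothetical:

    from pyspark.sql import SparkSession

    # Start a Spark session; on a managed cluster one is usually provided.
    spark = SparkSession.builder.appName("csv-on-s3").getOrCreate()

    # Hypothetical bucket and path; assumes S3 credentials are configured.
    df = spark.read.csv(
        "s3a://example-bucket/uploads/sales_2021-06-01.csv",
        header=True,
        inferSchema=True,
    )

    # A simple aggregation over a hypothetical 'region' column.
    df.groupBy("region").count().show()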


Redshift

This is Amazon Redshift data storage.

Data stored in Redshift can be queried using SQL, which makes it possible to run frequent aggregations on very large datasets.
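
For example, an aggregation over a hypothetical orders table might look like this:

    SELECT order_date,
           COUNT(*)    AS order_count,
           SUM(amount) AS revenue
    FROM   orders
    GROUP  BY order_date
    ORDER  BY order_date;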

Redshift is a relational database, so any data fed into it must map exactly, column by column, to the destination table’s schema.

Any failed rows are flagged and written to a separate table.


Failed row threshold

This is the number of failed rows that is acceptable before the feed is stopped.

The threshold should reflect the total number of rows being written to the table and the proportion of failures that is acceptable before the quality of the data would be considered compromised.
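
For example, if a feed writes around 1,000,000 rows and you judge that 0.1% of rows failing is acceptable, set the threshold to 1,000.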


Changing the data type of a schema

When specifying the destination for a data connector, you can change the data type of your schema. 

This function is available for all connectors apart from Webhooks and the old agent-based feeds.

Choose the required column name or field name and click the dropdown icon next to the Suggested Data Type. The following data types are available:

  • STRING
  • INTEGER
  • NUMERIC 
  • TIMESTAMP
  • DATE
  • BOOLEAN
  • JSON

Note:
In the current release, TIMESTAMPTZ is not supported.
Any data in this format will be ingested as a string by default.



Setting a trigger

From the Trigger stage, you can define triggers and watchers:

  • Triggers enable you to define when a data feed is run. 
  • Watchers can be added to feeds to provide notifications of feed events to Peak users or other systems.

Triggers

Triggers enable you to define when a data feed is run. There are three types of trigger:

  • Schedule trigger:
    Schedule when the feed runs. Basic and advanced (Cron) schedulers are available.
  • Webhook trigger:
    Trigger a feed to run via a webhook from another system.
  • Run Once trigger:
    Trigger the feed to run once at either a set time or manually from the feed list.

Basic Schedule Trigger

  • Basic schedules use days and time.
  • The feed will run on the selected days (highlighted in blue).
  • Enter a suitable time or frequency for the tenant’s environment.

Advanced Schedule Trigger

  • Advanced schedules use Cron.
  • Enter the time / frequency as a Cron string.

Cron formatting

A cron expression is a string comprising six or seven fields separated by spaces; the seventh field (Year) is optional.


Field         | Mandatory | Allowed Values   | Allowed Special Characters
--------------|-----------|------------------|---------------------------
Seconds       | Yes       | 0-59             | , - * /
Minutes       | Yes       | 0-59             | , - * /
Hours         | Yes       | 0-23             | , - * /
Day of month  | Yes       | 1-31             | , - * ? / L W
Month         | Yes       | 1-12 or JAN-DEC  | , - * /
Day of week   | Yes       | 1-7 or SUN-SAT   | , - * ? / L #
Year          | No        | empty, 1970-2099 | , - * /
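
For example, the expression 0 15 10 ? * 6L (also used in the examples below) breaks down field by field as:

    Seconds:       0    (at second 0)
    Minutes:       15   (at minute 15)
    Hours:         10   (at 10am)
    Day of month:  ?    (no specific value)
    Month:         *    (every month)
    Day of week:   6L   (the last Friday of the month; 6 = Friday)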


Cron expression examples

Expression         | Meaning
-------------------|------------------------------------------------------
0 0 12 * * ?       | Trigger at 12pm (noon) every day
0 15 10 * * ? 2021 | Trigger at 10:15am every day during the year 2021
0 15 10 ? * 6L     | Trigger at 10:15am on the last Friday of every month


Webhook triggers

Webhook triggers are used to trigger a data feed when data on a separate system has been updated. 

Webhooks work in a similar way to regular APIs, but rather than making constant requests to other systems to check for updates, webhooks will only send data when a particular event takes place - in this case when new data is available for the data feed.

Using the webhook URL

The webhook URL is generated by Peak and is unique to the data feed that you are creating or editing. The data source system needs the URL so that it knows where to send the notification.

  1. From the Trigger stage, click Webhook and copy the URL.
    If required, you can generate a new URL by clicking the curved arrow.
  2. Use the URL in the webhook section of the application that you want to receive data from.
    If the system is external to Peak, you will also need to provide it with an API Key for your tenant so that the webhook can be authenticated (a sketch of such a call follows these steps).
    For more information about generating API Keys, see API Keys.
  3. Once you have generated and copied your webhook URL, click SAVE.
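
As an illustration of step 2, an external system could trigger the feed with an HTTP POST to the webhook URL. This is a minimal sketch: the URL is a placeholder for the one copied in step 1, and the authentication header name is an assumption, so check the API Keys documentation for your tenant’s exact scheme.

    import requests

    # Placeholder: use the webhook URL copied from the Trigger stage.
    WEBHOOK_URL = "https://example.peak.ai/webhooks/your-feed-id"

    # Placeholder: an API Key generated for your tenant.
    API_KEY = "your-api-key"

    # The header name is an assumption; consult the API Keys
    # documentation for the exact authentication scheme.
    response = requests.post(WEBHOOK_URL, headers={"Authorization": API_KEY})
    response.raise_for_status()  # a non-2xx response raises an error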

Run Once Triggers

Run Once triggers are used to run the feed once at either a set time or manually from the feeds list.

From the Run Type drop-down menu, choose either:

  • Manual:
    This enables you to trigger the feed manually from the feeds list.
    To do this, go to Dock > Data Sources, hover over the feed and click ‘Run now’.
    For more information, see Managing your data feeds.
  • Date and Time:
    The feed will run once at the scheduled date and time.
    The time you enter must be at least 30 minutes from the current time.


Watchers

Watchers can be added to feeds to provide notifications of feed events to Peak users or other systems. 

There are two types of watcher:

  • User watcher:
    These are users of your tenant who will receive a notification within the platform if a feed event occurs.
  • Webhook watcher:
    These are used to trigger or send notifications to other systems or applications when a feed is updated.
    They could include external applications such as Slack or internal Peak functions such as Workflows.

To add a watcher:

  1. From the Trigger step screen, click ADD WATCHER.
  2. Choose either User or Webhook.


User Watchers

These are users of your tenant who will receive a notification within the platform if a feed event occurs.

  1. To choose a tenant user to add as a watcher, click the Search User drop-down.
  2. Choose the data feed events that you want the user to be notified of.  
    You can choose to watch all or a custom selection.
    Once added, users can view notifications by clicking the bell icon at the top of the screen.

Data feed events

Users can be notified of the following data feed events:

  • Create:
    The feed has been created.
  • Execution status:
    The execution status of the feed has changed.
  • Run fail:
    The feed run has failed.
  • Edit / delete:
    The feed has been edited or deleted.
  • Run success:
    The feed has run successfully.
  • No new data:
    There is no new data available on the feed.


Webhook Watchers

These are used to trigger or send notifications to other systems or applications when a feed is updated.
They could include external applications such as Slack or internal Peak functions such as Workflows.

The Webhook URL is taken from the application that you want to trigger if an event occurs. 

If this is a Peak Workflow, the URL can be taken from the workflow’s trigger step.

The JSON payload is optional. It can be used to pass variables that provide additional information about the feed. Parameters can include the following (an example payload follows this list):

  • {tenantname}
  • {jobtype}
  • {jobname}
  • {trigger}
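
For example, a payload using these parameters might look like the following. The key names are arbitrary and chosen by you; the bracketed variables are the parameters listed above:

    {
      "tenant": "{tenantname}",
      "job_type": "{jobtype}",
      "job_name": "{jobname}",
      "triggered_by": "{trigger}"
    }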


Data feed events

Webhooks can be configured for the following data feed events:

  • Run fail:
    The feed run has failed.
  • Run success:
    The feed has run successfully.
  • Running for more than x minutes:
    The feed has been running for more than the specified number of minutes.
  • No new data:
    There is no new data available on the feed.