Administrator

How do I set up and manage Data streams in Buzz?

A Data stream is a process to transmit Buzz system events (formatted as JSON objects) that can be used for data analysis, statistical tracking, or synchronization with another system.

In Buzz, Administrators can use different Stream types and configure multiple Data streams so that data for various Events is sent, in near real time, to one or more third-party services.

Use Buzz's Data streams to collect, analyze, process, and ultimately leverage data to:

  • Streamline processes.
  • Complete statistical research and compliance reporting.
  • Improve student experience and teacher effectiveness.

Stream types

Buzz is set up to deliver data using the following Stream types:

  • Amazon Kinesis Data Firehose: Amazon Kinesis Data Firehose is an extract, transform, and load (ETL) service that reliably captures, transforms, and delivers streaming data to data lakes, data stores, and analytics services.
  • Amazon Kinesis Data Stream: Amazon Kinesis Data Streams is a serverless streaming data service.
  • Amazon Simple Queue Service (SQS): Amazon Simple Queue Service (SQS) lets you send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.
  • HTTPS: HTTPS streaming allows data to be sent continuously to a client over a single HTTPS connection that remains open indefinitely.

To set up additional Data streams, click the Add stream button.

Event types

The Data stream can send various events related to domain, course, user, enrollment data, and more. To learn more about all of the available events and how you can filter your stream based upon these events, see How do I filter Data streams to send specific Events and Properties?

Configure Data streams with Amazon Kinesis Data Firehose

Configuring Data streams with Amazon Kinesis Data Firehose requires working in both Buzz and Amazon:

Amazon Kinesis Data Firehose Limitations: Review to make sure Firehose is right for you

A single Firehose may not be a suitable solution for customers who need to receive large numbers of notifications.

The API and Task servers send data stream events in real time without any buffering on our side, and Firehose has a limit of 2,000 requests per second, which cannot be adjusted without manual intervention.

If you ever expect bursts of more than 2,000 data stream notifications per second, Firehose is probably not a good option for you.

Firehose can, however, easily send data to various persistent storage systems, so if your needs fall within its throughput limits and you need to store the event notifications long-term, Firehose is a good choice.

Event notifications may retry a few times (based on AWS SDK retry policies), but will not retry outside of that, so if limits are exceeded, event notifications may be lost.

Sending different types of data stream event notifications to different Firehoses may mitigate this problem to some extent.

Configure Amazon to use Kinesis Data Firehose in Buzz

When setting up your Amazon Kinesis Data Firehose account for Buzz's API servers to write into, you need to configure cross-account access.

To configure Amazon: Amazon Kinesis Data Firehose configuration
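
As a rough illustration of what cross-account access usually involves on the AWS side, the sketch below builds two IAM policy documents: a trust policy that lets another AWS account assume a role, and a permissions policy that lets that role write to one delivery stream. Every value in it is an assumption for illustration only. The account IDs, region, and stream name are hypothetical placeholders, and the exact actions Buzz requires are not stated in this article, so treat the linked configuration article as the authoritative source.

```python
import json

# Hypothetical placeholders, NOT real Buzz values. Take the real ones
# from the linked Amazon configuration article.
TRUSTED_ACCOUNT_ID = "111122223333"   # the account that will assume the role
OWNER_ACCOUNT_ID = "444455556666"     # your own AWS account
REGION = "us-east-1"
STREAM_NAME = "my-delivery-stream"


def trust_policy(trusted_account_id: str) -> dict:
    """Trust policy allowing principals in another account to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def firehose_write_policy(region: str, account_id: str, stream_name: str) -> dict:
    """Permissions policy allowing writes to a single Firehose delivery stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed set of actions; the service may need more or fewer.
                "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
                "Resource": (
                    f"arn:aws:firehose:{region}:{account_id}"
                    f":deliverystream/{stream_name}"
                ),
            }
        ],
    }


print(json.dumps(trust_policy(TRUSTED_ACCOUNT_ID), indent=2))
print(json.dumps(firehose_write_policy(REGION, OWNER_ACCOUNT_ID, STREAM_NAME), indent=2))
```

The ARN of the role that carries these two policies is what you would then enter as the ARN role when configuring the Data stream in Buzz.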

Configure Buzz for Amazon Kinesis Data Firehose

To configure Data streams in Buzz using Amazon Kinesis Data Firehose:

  1. Open the More menu in Admin > Domain.
  2. Select Data streams.
  3. Provide a Description of the Data stream. The Description is for your use and should briefly describe what the Data stream is for and/or how you're using it.
    • Example: If you're setting up a Data stream to deliver all new domain data to an admin dashboard, your Description might be AdminDashboard_NewDomains_Firehose.
  4. Select Amazon Kinesis Data Firehose as your Stream type.
  5. Provide the Stream name.
    • This is an identifier provided by AWS; it is the name of the Kinesis Data Firehose Delivery Stream you want to put event records into.
  6. Provide your ARN role.
    • You can find your ARN (Amazon Resource Name) role through AWS.
  7. Check the Enabled box to ensure the Data stream begins working when you Save.
    • If you're not ready to enable the stream, you can leave the box unchecked and Save the configuration, and the Data stream will not begin sending Events.
  8. Click Add filter if you want to limit the amount of data sent to meet your specific needs and optimize storage.
  9. Once you're done configuring the Data stream, you can:
    • Click Test to make sure the Data stream is set up correctly and working. This sends a single event, so you can verify that your configuration is correct.
    • Click Save. If the Enabled box is checked, your Data stream begins sending data; if not, your configuration is saved, and no data is sent.

Note: If you Enable and Save your Data stream without adding filters, Buzz automatically sends data for all Events to the defined destination. This can result in large amounts of unnecessary storage.

Configure Data streams with Amazon Kinesis Data Stream

Configuring Data streams with Amazon Kinesis Data Stream requires working in both Buzz and Amazon:

Amazon Kinesis Data Stream Limitations: Review to make sure Kinesis Data Stream is right for you

A Kinesis Data Stream with a single shard may not be a suitable solution for customers wanting to get large numbers of notifications.

The API and Task servers send data stream events in real time without any buffering on our side, and each Kinesis Data Stream shard is limited to 1,000 records and 1 MB per second for writes.

Accounts are limited to 500 shards by default, so 500 shards would provide a maximum burst capability of 500,000 data stream notifications per second (assuming the notifications average less than 1KB each).
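
The per-shard arithmetic above can be turned into a quick sizing estimate. The sketch below is only a back-of-the-envelope helper: the function name is hypothetical, the limits are the ones quoted in this section, and 1 MB is treated as 1,000,000 bytes, which is an assumption. Measure your own peak event rate and average record size before relying on the result.

```python
# Write limits per shard, as quoted in this section.
RECORDS_PER_SHARD_PER_SEC = 1_000
BYTES_PER_SHARD_PER_SEC = 1_000_000  # assumption: 1 MB counted as 10**6 bytes


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division without floating-point rounding."""
    return -(-numerator // denominator)


def shards_needed(peak_events_per_sec: int, avg_event_bytes: int) -> int:
    """Estimate the shards required to absorb a peak burst of events.

    A shard is saturated by whichever limit is hit first: the record
    count per second or the bytes per second.
    """
    by_count = ceil_div(peak_events_per_sec, RECORDS_PER_SHARD_PER_SEC)
    by_bytes = ceil_div(peak_events_per_sec * avg_event_bytes, BYTES_PER_SHARD_PER_SEC)
    return max(1, by_count, by_bytes)
```

For example, shards_needed(5000, 800) returns 5 because the record-count limit dominates, while shards_needed(5000, 2500) returns 13 because larger records hit the byte limit first.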

Messages are retained for 24 hours by default, and retention can be extended to up to 365 days if configured.

Kinesis Data Streams appear to be a middle ground between Firehose and SQS: they offer lower throughput but longer message retention than SQS, and higher throughput but shorter retention than Firehose.

Event notifications may retry a few times (based on AWS SDK retry policies), but will not retry outside of that, so if limits are exceeded, event notifications may be lost.

Configure Amazon to use Kinesis Data Stream in Buzz

When setting up your Amazon Kinesis Data Stream account for Buzz's API servers to write into, you need to configure cross-account access.

To configure Amazon: Amazon Kinesis Data Stream configuration

Configure Buzz to use Amazon Kinesis Data Stream

To configure Data streams in Buzz using Amazon Kinesis Data Stream:

  1. Open the More menu in Admin > Domain.
  2. Select Data streams.
  3. Provide a Description of the Data stream. The Description is for your use and should briefly describe what the Data stream is for and/or how you're using it.
    • Example: If you're setting up a Data stream to deliver all new domain data to an admin dashboard, your Description might be AdminDashboard_NewDomains_DataStream.
  4. Select Amazon Kinesis Data Stream as your Stream type.
  5. Provide the Stream name.
    • This is an identifier provided by AWS; it is the name of the Kinesis Data Stream you want to put event records into.
  6. Provide your ARN role.
    • You can find your ARN (Amazon Resource Name) role through AWS.
  7. Check the Enabled box to ensure the Data stream begins working when you Save.
    • If you're not ready to enable the stream, you can leave the box unchecked and Save the configuration, and the Data stream will not begin sending Events.
  8. Click Add filter if you want to limit the amount of data sent to meet your specific needs and optimize storage.
  9. Once you're done configuring the Data stream, you can:
    • Click Test to make sure the Data stream is set up correctly and working. This sends a single event, so you can verify that your configuration is correct.
    • Click Save. If the Enabled box is checked, your Data stream begins sending data; if not, your configuration is saved, and no data is sent.

Note: If you Enable and Save your Data stream without adding filters, Buzz automatically sends data for all Events to the defined destination. This can result in large amounts of unnecessary storage.

Configure Data streams with Amazon Simple Queue Service (SQS)

Configuring Data streams with Amazon SQS requires working in both Buzz and Amazon:

Amazon SQS Limitations: Review to make sure SQS is right for you

FIFO SQS Queues may not be a suitable solution, as they are limited to 300 transactions per second, which is likely to be exceeded in most use cases.

Standard SQS Queues, according to the documentation as of 2023-04-05, "support a nearly unlimited number of API calls per second, per API action," so they are ideal for high-throughput data streaming. However, they retain events for only four days by default (this can be increased to a maximum of 14 days), so you need a process on your side to consume these events and store them in another system if you want to keep them for more than a few days.

Event notifications may retry a few times (based on AWS SDK retry policies), but will not retry outside of that, so if limits are exceeded, event notifications may be lost.
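
Because SQS retention is short, some consumer process has to read the queue and persist the events elsewhere. The following is a minimal sketch of such a consumer, not a recommended implementation. It assumes each SQS message body holds one JSON Event record, which this article does not confirm, and the drain_queue and store names are hypothetical. The sqs argument is expected to behave like a boto3 SQS client (for example, boto3.client("sqs")); it is passed in so that the block itself has no third-party imports.

```python
import json
from typing import Any, Callable


def drain_queue(
    sqs: Any,
    queue_url: str,
    store: Callable[[dict], None],
    max_batches: int = 10,
) -> int:
    """Read up to max_batches batches, persist each event, then delete it.

    A message is deleted only after store() returns, so a crash in
    store() leaves the message on the queue to be delivered again.
    """
    stored = 0
    for _ in range(max_batches):
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # the most SQS returns per call
            WaitTimeSeconds=20,      # long polling to reduce empty responses
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue is empty for now
        for message in messages:
            event = json.loads(message["Body"])  # assumed: one record per message
            store(event)
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            stored += 1
    return stored
```

Standard queues deliver messages at least once, so store() may see the same event more than once; the guid member of each Event record is a natural key for de-duplication.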

Configure Amazon to use SQS in Buzz

When setting up your Amazon SQS account for Buzz's API servers to write into, you need to configure cross-account access.

To configure Amazon: Amazon Simple Queue Service (SQS) configuration

Configure Buzz to use Amazon SQS

To configure Data streams in Buzz using Amazon SQS:

  1. Open the More menu in Admin > Domain.
  2. Select Data streams.
  3. Provide a Description of the Data stream. The Description is for your use and should briefly describe what the Data stream is for and/or how you're using it.
    • Example: If you're setting up a Data stream to deliver all new domain data to an admin dashboard, your Description might be AdminDashboard_NewDomains_SQS.
  4. Select Amazon Simple Queue Service as your Stream type.
  5. Provide the Stream name.
    • This is an identifier provided by AWS; it is the name of the SQS queue you want to put event records into.
  6. Provide your ARN role.
    • You can find your ARN (Amazon Resource Name) role through AWS.
  7. Check the Enabled box to ensure the Data stream begins working when you Save.
    • If you're not ready to enable the stream, you can leave the box unchecked and Save the configuration, and the Data stream will not begin sending Events.
  8. Click Add filter if you want to limit the amount of data sent to meet your specific needs and optimize storage.
  9. Once you're done configuring the Data stream, you can:
    • Click Test to make sure the Data stream is set up correctly and working. This sends a single event, so you can verify that your configuration is correct.
    • Click Save. If the Enabled box is checked, your Data stream begins sending data; if not, your configuration is saved, and no data is sent.

Note: If you Enable and Save your Data stream without adding filters, Buzz automatically sends data for all Events to the defined destination. This can result in large amounts of unnecessary storage.

Configure Data streams with HTTPS

To configure Data streams in Buzz using HTTPS:

  1. Open the More menu in Admin > Domain.
  2. Select Data streams.
  3. Provide a Description of the Data stream. The Description is for your use and should briefly describe what the Data stream is for and/or how you're using it.
    • Example: If you're setting up a Data stream to deliver all new domain data to an admin dashboard, your Description might be AdminDashboard_NewDomains_SISName.example.com.
  4. Select HTTPS.
    • HTTPS streaming allows data to be sent continuously to a client over a single HTTPS connection that remains open indefinitely.
  5. Provide a Stream name. This is an internal unique identifier for this stream and should be an abbreviated reference to the URL(s) that are receiving the data.
    • Example: If you are sending the data to a student information system, you might use SISName.example.com_AdminDashboard.
  6. Provide the:
    • Timeout seconds, or the amount of time you want to allow for the HTTPS connection.
    • Retries, or the number of times you want the Data stream to retry the HTTPS connection.
  7. Provide the Endpoints, or URL(s) to which the data is being sent. You can have up to five Endpoints separated by semicolons, each acting as a backup. Your Data stream attempts to send data to each Endpoint in order until one is successfully reached.
  8. Select the HTTP method: POST or PUT.
  9. Check the Enabled box to ensure the Data stream begins working when you Save.
    • If you're not ready to enable the stream, you can leave the box unchecked and Save the configuration, and the Data stream will not begin sending Events.
  10. Click Add filter if you want to limit the amount of data sent to meet your specific needs and optimize storage.
  11. Once you're done configuring the Data stream, you can:
    • Click Test to make sure the Data stream is set up correctly and working. This sends a single event, so you can verify that your configuration is correct.
    • Click Save. If the Enabled box is checked, your Data stream begins sending data; if not, your configuration is saved, and no data is sent.

Note: If you Enable and Save your Data stream without adding filters, Buzz automatically sends data for all Events to the defined destination. This can result in large amounts of unnecessary storage.
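
On the receiving side, an Endpoint has to accept the POST or PUT requests that the Data stream sends. The sketch below shows one way such a receiver might look using only the Python standard library. It rests on several assumptions that this article does not confirm: that each request body contains one or more Event records, one JSON object per line, and that any 2xx response counts as success. The handler, function names, and port are hypothetical. The server speaks plain HTTP and assumes TLS is terminated in front of it (for example, by a reverse proxy), since Buzz sends to HTTPS URLs.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_records(body: bytes) -> list:
    """Parse one JSON record per non-empty line (assumed framing)."""
    text = body.decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


class StreamHandler(BaseHTTPRequestHandler):
    """Accepts POST and PUT requests carrying Event records."""

    def _receive(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            records = parse_records(self.rfile.read(length))
        except ValueError:
            # Covers both invalid UTF-8 and invalid JSON.
            self.send_response(400)
            self.end_headers()
            return
        for record in records:
            if isinstance(record, dict):
                # Replace this print with your own storage or processing.
                print(record.get("type"), record.get("guid"))
        self.send_response(200)
        self.end_headers()

    # The Data stream can be configured to use either method.
    do_POST = _receive
    do_PUT = _receive


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("127.0.0.1", port), StreamHandler).serve_forever()
```

Calling serve() starts the receiver locally; clicking Test in Buzz against the public URL that fronts it would then be a way to confirm that a single event arrives.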

Learn more: How do I filter Data streams to send specific Events and Properties?

Filter your Data streams to send only the data you need

Because Data streams automatically send data that you will be storing, Buzz lets you filter the data by Events and Properties within those Events to avoid using storage for data you don't need.

Learn more: How do I filter Data streams to send specific Events and Properties?

Interpret and map Data stream content for your use

Data streams deliver data as Event records. Each Event record is always wrapped in a standard container, which you can see in the following example (all the top-level members are part of the standard container):

{
	"time":"2022-06-29T16:40:30.9926931Z",
	"guid":"c7a820f6-36ed-48f6-8210-3581238d30a6",
	"domainId":"1176022",
	"type":"DomainEntityChanged",
	"data":{... },
	"userId":"93275",
	"agentUserId":"2239",
	"sessionId":"8655695467559064423"
}

Note: The above record is formatted for readability, but when delivered, each record in the Data stream appears as a single line with minimal whitespace.

The following table defines each member of the standard container.

Member | Description
--- | ---
time | The UTC time of the Event.
guid | A GUID that uniquely identifies the action, even across Data streams and domains.
domainId | The ID of the domain where the Event occurred.
type | The type of the Event, which identifies the schema used for the data Property.
data | An object with details about the Event that is different for each Event type.
userId | The ID of the authorized user who performed the action (may be missing if the action was performed internally by the system itself). If a user was proxying another user, this is the ID of the user being proxied.
agentUserId | The ID of the agent user who performed the action (may be missing if the action was performed internally by the system itself). If a user was proxying another user, this is the ID of the user acting as proxy.
sessionId | The ID of the user session that made the request (may be missing if the action was performed internally by the system itself).
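
To show how the standard container might be mapped into application code, here is a minimal parsing sketch. The EventRecord class and function names are hypothetical. It treats userId, agentUserId, and sessionId as optional because the table notes they may be missing, and it truncates the timestamp's fractional seconds to six digits because the sample record carries seven while Python's datetime stores microseconds.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class EventRecord:
    time: datetime
    guid: str
    domain_id: str
    type: str
    data: Any
    user_id: Optional[str] = None        # may be absent for system actions
    agent_user_id: Optional[str] = None  # may be absent for system actions
    session_id: Optional[str] = None     # may be absent for system actions


def parse_event_time(value: str) -> datetime:
    """Parse a UTC timestamp such as 2022-06-29T16:40:30.9926931Z."""
    base, _, fraction = value.rstrip("Z").partition(".")
    micros = (fraction + "000000")[:6]  # pad or truncate to microseconds
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def parse_record(line: str) -> EventRecord:
    """Map one single-line JSON Event record onto EventRecord."""
    raw = json.loads(line)
    return EventRecord(
        time=parse_event_time(raw["time"]),
        guid=raw["guid"],
        domain_id=raw["domainId"],
        type=raw["type"],
        data=raw.get("data"),
        user_id=raw.get("userId"),
        agent_user_id=raw.get("agentUserId"),
        session_id=raw.get("sessionId"),
    )
```

The type member can then be used to dispatch each record to a handler that understands the schema of its data member.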

Have a question or feedback? Let us know over in Discussions!