# Intro to Data Collection

### About this export

| Field | Value |
| --- | --- |
| **content_type** | lesson |
| **platform** | contentstack-academy |
| **source_url** | https://www.contentstack.com/academy/courses/lytics-essentials/intro-to-data-collection |
| **course_slug** | lytics-essentials |
| **lesson_slug** | intro-to-data-collection |
| **markdown_file_url** | /academy/md/courses/lytics-essentials/intro-to-data-collection.md |
| **generated_at** | 2026-04-28T06:55:47.639Z |

> Part of **[Lytics Essentials](https://www.contentstack.com/academy/courses/lytics-essentials)** on Contentstack Academy. **Academy MD v3** — structured for retrieval; no quiz or assessment keys.

<!-- ai_metadata: {"lesson_id":"03","type":"video","duration_seconds":247,"video_url":"https://cdn.jwplayer.com/previews/zNbxjL4p","thumbnail_url":"https://cdn.jwplayer.com/v2/media/zNbxjL4p/poster.jpg?width=720","topics":["Intro","Data","Collection"]} -->

#### Video details

#### At a glance

- **Title:** Intro To Data Collection-Overview
- **Duration:** 4m 7s
- **Media link:** https://cdn.jwplayer.com/previews/zNbxjL4p
- **Publish date (unix):** 1750938967

#### Streaming renditions

- application/vnd.apple.mpegurl
- audio/mp4 · AAC Audio · 113731 bps
- video/mp4 · 180p · 138672 bps
- video/mp4 · 270p · 151287 bps
- video/mp4 · 360p · 164889 bps
- video/mp4 · 406p · 174567 bps
- video/mp4 · 540p · 197253 bps
- video/mp4 · 720p · 236161 bps

#### Timed text tracks (delivery)

- **thumbnails:** `https://cdn.jwplayer.com/strips/zNbxjL4p-120.vtt`

#### Transcript

When it comes down to getting your data into Lytics, there are many, many ways, but really they come down to three primary types. One, web data coming from our JavaScript tag. Two, data coming from a variety of different APIs, whether it's a bulk import, a single event import, or something coming from an internal system, or most commonly, one of our pre-built integrations. With our pre-built integrations, we can pull data directly from your most common channel tools, from CSVs, from SFTPs, and virtually any other source that you can think of.

Each collection method is slightly different, and they're all going to adhere to a few specific rules. One, data must always be directed at a stream. If a stream is not explicitly defined, it will land in the default stream. Streams represent unique sources of data. Each of these sources of data can have their own unique mapping rules, reducing much of the data prep overhead when sending your events to Lytics. And second, every event must contain some sort of identifier that can be used to associate the event with an individual user or human. This may be a web cookie, a user ID, an email, etc.

Now that we know some of the rules around data collection, let's look at an example of how data can be collected from the web using our JavaScript tag. Going over to the Lytics.com website, we already have our tag installed. At the bottom of the screen, you'll see Chrome DevTools has already been opened up. This allows me to handwrite JavaScript code live in the browser to be ran. If I want to collect an event from the web, for instance, it's as simple as firing off what we call a JSTag.sendEvent. In this case, it's going to send the payload of test true, where the key is test and the value is true, and automatically pull in my UID, my browser information, and all sorts of relevant contextual data to associate with that event.

If I hit enter, you'll see that event fire, and then we can actually go in closer and look at the exact information that was sent along with it. So you'll see again, my payload test true, timestamp, my device type, the UID for my particular user, this is what acts as the identifier, version of the JavaScript tag, and so on.

Digging in deeper, if you go over to the Learn.Lytics documentation, you can see a variety of other API-based methods that we support. There's bulk CSV uploads, where you pass, say, a CSV file, and it processes row by row. There's individual JSON endpoints, where you can send a JSON payload, and so on. Regardless of the method for data collection, the approach is similar, in that there always must be an event level information, identifier, and stream.

When verifying if your data has been collected, the easiest method is to go and look at the stream that you passed it to. Given that we collected our data from the web layer, it's going to automatically go to the default stream. We could have just as easily passed it to a custom stream, but we did not. On the Streams page, it'll give you a count of the last event received, which in many cases will show you when and if your event worked, and then you can dive in below to all of the raw events seen to make sure that the one that you're looking for is found. So if we go here and we do a refresh to get the latest data, and then search for test, we can see that there's a test field, and true, and it was actually seen on April 5th, so it's today, which is likely going to be our event.

For more in-depth information on data collection... For more in-depth information and advanced use cases around data collection, please refer to our online documentation on Learn.lytics, or one of our more advanced courses on data mapping and collection. Thank you.

#### Subtitles (WebVTT)

```webvtt
WEBVTT

1
00:00:00.000 --> 00:00:05.720
When it comes down to getting your data into Lytics, there are many, many ways, but really

2
00:00:05.720 --> 00:00:08.680
they come down to three primary types.

3
00:00:08.680 --> 00:00:12.320
One, web data coming from our JavaScript tag.

4
00:00:12.320 --> 00:00:17.400
Two, data coming from a variety of different APIs, whether it's a bulk import, a single

5
00:00:17.400 --> 00:00:22.720
event import, or something coming from an internal system, or most commonly, one of

6
00:00:22.720 --> 00:00:23.720
our pre-built integrations.

7
00:00:23.720 --> 00:00:27.640
With our pre-built integrations, we can pull data directly from your most common channel

8
00:00:27.640 --> 00:00:35.160
tools, from CSVs, from SFTPs, and virtually any other source that you can think of.

9
00:00:35.160 --> 00:00:39.640
Each collection method is slightly different, and they're all going to adhere to a few specific

10
00:00:39.640 --> 00:00:40.640
rules.

11
00:00:40.640 --> 00:00:43.800
One, data must always be directed at a stream.

12
00:00:43.800 --> 00:00:48.960
If a stream is not explicitly defined, it will land in the default stream.

13
00:00:48.960 --> 00:00:51.560
Streams represent unique sources of data.

14
00:00:51.560 --> 00:00:56.320
Each of these sources of data can have their own unique mapping rules, reducing much of

15
00:00:56.320 --> 00:01:00.600
the data prep overhead when sending your events to Lytics.

16
00:01:00.600 --> 00:01:06.600
And second, every event must contain some sort of identifier that can be used to associate

17
00:01:06.600 --> 00:01:10.520
the event with an individual user or human.

18
00:01:10.520 --> 00:01:18.400
This may be a web cookie, a user ID, an email, etc.

19
00:01:18.400 --> 00:01:22.240
Now that we know some of the rules around data collection, let's look at an example

20
00:01:22.240 --> 00:01:27.840
of how data can be collected from the web using our JavaScript tag.

21
00:01:27.840 --> 00:01:32.000
Going over to the Lytics.com website, we already have our tag installed.

22
00:01:32.000 --> 00:01:35.520
At the bottom of the screen, you'll see Chrome DevTools has already been opened up.

23
00:01:35.520 --> 00:01:41.800
This allows me to handwrite JavaScript code live in the browser to be ran.

24
00:01:41.800 --> 00:01:45.860
If I want to collect an event from the web, for instance, it's as simple as firing off

25
00:01:45.860 --> 00:01:56.300
what we call a JSTag.sendEvent.

26
00:01:56.300 --> 00:02:00.900
In this case, it's going to send the payload of test true, where the key is test and the

27
00:02:00.900 --> 00:02:06.120
value is true, and automatically pull in my UID, my browser information, and all sorts

28
00:02:06.120 --> 00:02:09.900
of relevant contextual data to associate with that event.

29
00:02:09.900 --> 00:02:14.660
If I hit enter, you'll see that event fire, and then we can actually go in closer and

30
00:02:14.660 --> 00:02:17.420
look at the exact information that was sent along with it.

31
00:02:17.420 --> 00:02:23.700
So you'll see again, my payload test true, timestamp, my device type, the UID for my

32
00:02:23.700 --> 00:02:28.420
particular user, this is what acts as the identifier, version of the JavaScript tag,

33
00:02:28.420 --> 00:02:31.140
and so on.

34
00:02:31.140 --> 00:02:35.020
Digging in deeper, if you go over to the Learn.Lytics documentation, you can see a variety of other

35
00:02:35.020 --> 00:02:38.300
API-based methods that we support.

36
00:02:38.300 --> 00:02:44.220
There's bulk CSV uploads, where you pass, say, a CSV file, and it processes row by row.

37
00:02:44.220 --> 00:02:50.540
There's individual JSON endpoints, where you can send a JSON payload, and so on.

38
00:02:50.540 --> 00:02:54.620
Regardless of the method for data collection, the approach is similar, in that there always

39
00:02:54.620 --> 00:03:00.320
must be an event level information, identifier, and stream.

40
00:03:00.320 --> 00:03:03.580
When verifying if your data has been collected, the easiest method is to go and look at the

41
00:03:03.580 --> 00:03:05.420
stream that you passed it to.

42
00:03:05.420 --> 00:03:09.140
Given that we collected our data from the web layer, it's going to automatically go

43
00:03:09.140 --> 00:03:10.420
to the default stream.

44
00:03:10.420 --> 00:03:16.180
We could have just as easily passed it to a custom stream, but we did not.

45
00:03:16.180 --> 00:03:19.380
On the Streams page, it'll give you a count of the last event received, which in many

46
00:03:19.380 --> 00:03:24.020
cases will show you when and if your event worked, and then you can dive in below to

47
00:03:24.020 --> 00:03:27.500
all of the raw events seen to make sure that the one that you're looking for is found.

48
00:03:27.500 --> 00:03:37.780
So if we go here and we do a refresh to get the latest data, and then search for test,

49
00:03:37.780 --> 00:03:41.700
we can see that there's a test field, and true, and it was actually seen on April 5th,

50
00:03:41.700 --> 00:03:45.100
so it's today, which is likely going to be our event.

51
00:03:45.100 --> 00:03:50.300
For more in-depth information on data collection...

52
00:03:50.300 --> 00:03:54.040
For more in-depth information and advanced use cases around data collection, please refer

53
00:03:54.040 --> 00:03:59.260
to our online documentation on Learn.lytics, or one of our more advanced courses on data

54
00:03:59.260 --> 00:04:01.260
mapping and collection.

55
00:04:01.260 --> 00:04:03.180
Thank you.

```

```transcript
[00:00] When it comes down to getting your data into Lytics, there are many, many ways, but really
[00:05] they come down to three primary types.
[00:08] One, web data coming from our JavaScript tag.
[00:12] Two, data coming from a variety of different APIs, whether it's a bulk import, a single
[00:17] event import, or something coming from an internal system, or most commonly, one of
[00:22] our pre-built integrations.
[00:23] With our pre-built integrations, we can pull data directly from your most common channel
[00:27] tools, from CSVs, from SFTPs, and virtually any other source that you can think of.
[00:35] Each collection method is slightly different, and they're all going to adhere to a few specific
[00:39] rules.
[00:40] One, data must always be directed at a stream.
[00:43] If a stream is not explicitly defined, it will land in the default stream.
[00:48] Streams represent unique sources of data.
[00:51] Each of these sources of data can have their own unique mapping rules, reducing much of
[00:56] the data prep overhead when sending your events to Lytics.
[01:00] And second, every event must contain some sort of identifier that can be used to associate
[01:06] the event with an individual user or human.
[01:10] This may be a web cookie, a user ID, an email, etc.
[01:18] Now that we know some of the rules around data collection, let's look at an example
[01:22] of how data can be collected from the web using our JavaScript tag.
[01:27] Going over to the Lytics.com website, we already have our tag installed.
[01:32] At the bottom of the screen, you'll see Chrome DevTools has already been opened up.
[01:35] This allows me to handwrite JavaScript code live in the browser to be ran.
[01:41] If I want to collect an event from the web, for instance, it's as simple as firing off
[01:45] what we call a JSTag.sendEvent.
[01:56] In this case, it's going to send the payload of test true, where the key is test and the
[02:00] value is true, and automatically pull in my UID, my browser information, and all sorts
[02:06] of relevant contextual data to associate with that event.
[02:09] If I hit enter, you'll see that event fire, and then we can actually go in closer and
[02:14] look at the exact information that was sent along with it.
[02:17] So you'll see again, my payload test true, timestamp, my device type, the UID for my
[02:23] particular user, this is what acts as the identifier, version of the JavaScript tag,
[02:28] and so on.
[02:31] Digging in deeper, if you go over to the Learn.Lytics documentation, you can see a variety of other
[02:35] API-based methods that we support.
[02:38] There's bulk CSV uploads, where you pass, say, a CSV file, and it processes row by row.
[02:44] There's individual JSON endpoints, where you can send a JSON payload, and so on.
[02:50] Regardless of the method for data collection, the approach is similar, in that there always
[02:54] must be an event level information, identifier, and stream.
[03:00] When verifying if your data has been collected, the easiest method is to go and look at the
[03:03] stream that you passed it to.
[03:05] Given that we collected our data from the web layer, it's going to automatically go
[03:09] to the default stream.
[03:10] We could have just as easily passed it to a custom stream, but we did not.
[03:16] On the Streams page, it'll give you a count of the last event received, which in many
[03:19] cases will show you when and if your event worked, and then you can dive in below to
[03:24] all of the raw events seen to make sure that the one that you're looking for is found.
[03:27] So if we go here and we do a refresh to get the latest data, and then search for test,
[03:37] we can see that there's a test field, and true, and it was actually seen on April 5th,
[03:41] so it's today, which is likely going to be our event.
[03:45] For more in-depth information on data collection...
[03:50] For more in-depth information and advanced use cases around data collection, please refer
[03:54] to our online documentation on Learn.lytics, or one of our more advanced courses on data
[03:59] mapping and collection.
[04:01] Thank you.
```

#### Lesson text

## Overview

**Note:** On January 10, 2023, we upgraded our UI with a new, refreshed interface. All of the underlying functionality is the same, but you will notice that things look a little different from this Academy guide. The most notable change is that the navigation menu has moved from the top of the app to the left side. We appreciate your patience as we work on updating our Academy.

What will I learn?

*   The three main ways to get your data into Lytics:
    *   Lytics JavaScript tag
    *   Lytics APIs
    *   Pre-built integrations
*   What a "Data Stream" is and how to use it
*   Things to avoid, where possible, when collecting data

In the "Overview" video, you'll be introduced to the ways to get your first-party user data into Lytics and validate its arrival. This section also covers some best practices on what kind of data you want to collect and what kind of data you may want to avoid.

### Match the collection method to the data sent to Lytics.

| Collection method | Data sent to Lytics |
| --- | --- |
| Lytics JavaScript tag | Website data |
| Lytics APIs | Bulk or custom imports, data from internal systems |
| Pre-built integrations | Data from common marketing tools (ads, email, etc.) |

**Regardless of the method, all data collection through Lytics requires which of the following?**

A. Event-level information

B. Identifier

C. Data Stream

D. All of the above

Answer: D
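
The three requirements in the question above can be pictured as a small pre-flight check. This is a minimal sketch, not Lytics code: the function name `prepare_event`, the identifier field names, and the stream name `"default"` are all assumptions chosen for illustration.

```python
from typing import Any, Dict, Optional, Tuple

DEFAULT_STREAM = "default"  # assumed name for the fallback stream
IDENTIFIER_KEYS = ("_uid", "user_id", "email")  # hypothetical identifier fields


def prepare_event(
    payload: Dict[str, Any], stream: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Check the three requirements and return (stream, payload)."""
    if not payload:
        raise ValueError("event has no event-level information")
    if not any(payload.get(key) for key in IDENTIFIER_KEYS):
        raise ValueError("event has no identifier")
    # When no stream is named, the event lands in the default stream.
    return (stream or DEFAULT_STREAM, dict(payload))


stream, event = prepare_event({"_uid": "abc123", "test": True})
print(stream, event)
```

Passing a second argument, for example `prepare_event(event, "crm")`, would model sending the same event to a custom stream instead.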

## Data Streams

All data sent to Lytics must be sent through a data stream.

**Data streams** are silos of **raw data** containing key-value pairs organized by source that can be used in the mapping of user fields.

![Sample\_Dream\_Streams.png](https://images.contentstack.io/v3/assets/bltebc53cfaf0dd6403/bltda638515d4c3784b/68627d85bf423e3db6dd662b/Sample_Dream_Streams.png)

Until a raw event in a data stream is mapped to a user field, it is not available for building audiences. This gives Lytics the ability to filter, aggregate, and merge raw data into user fields in a non-destructive way.
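
To make "non-destructive" concrete, here is a conceptual sketch of deriving user fields from raw events while leaving the raw events untouched. It is not the Lytics mapping syntax; the event keys, the `map_user_fields` function, and the resulting profile fields are hypothetical.

```python
from copy import deepcopy
from typing import Any, Dict, List

# Raw events as they might sit in one stream: plain key-value pairs.
raw_events: List[Dict[str, Any]] = [
    {"_uid": "abc123", "email": "Pat@Example.com", "page": "/pricing"},
    {"_uid": "abc123", "page": "/docs"},
]


def map_user_fields(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive user fields from raw events without modifying the events."""
    profile: Dict[str, Any] = {"email": None, "pages_viewed": 0}
    for event in events:
        if "email" in event:
            profile["email"] = event["email"].lower()  # normalize on the way in
        if "page" in event:
            profile["pages_viewed"] += 1  # aggregate across events
    return profile


snapshot = deepcopy(raw_events)
profile = map_user_fields(raw_events)
print(profile)
print(raw_events == snapshot)  # the raw data is unchanged
```

Because the raw events are never altered, a mapping rule could be changed later and re-applied to the same data.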

**What are Data Streams used for?**

The primary purpose is to verify that data is successfully being received by Lytics.
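
The verification step from the video (searching a stream's raw events for the `test` field) can be sketched as a simple lookup. The event shape, the `ts` timestamp field, and the `find_events` helper are assumptions for illustration; in practice this check happens on the Streams page.

```python
from typing import Any, Dict, List

# Hypothetical raw events as they might appear in the default stream.
default_stream: List[Dict[str, Any]] = [
    {"_uid": "abc123", "page": "/pricing", "ts": "2024-04-04T09:15:00Z"},
    {"_uid": "abc123", "test": True, "ts": "2024-04-05T10:30:00Z"},
]


def find_events(events: List[Dict[str, Any]], key: str, value: Any) -> List[Dict[str, Any]]:
    """Return raw events whose `key` equals `value`, newest first."""
    matches = [e for e in events if key in e and e[key] == value]
    return sorted(matches, key=lambda e: e["ts"], reverse=True)


matches = find_events(default_stream, "test", True)
print(matches[0]["ts"] if matches else "no matching event yet")
```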

**Data streams contain \_\_\_\_\_\_\_ data organized by source.**

A. Raw

B. Mapped

**Unless configured otherwise, all web data is automatically sent to which stream?**

A. App stream

B. Default stream

C. Activity stream

D. Primary stream

At the end of the day, the Lytics platform is flexible, and there is no single "right" way to use it. That said, there are some best practices we've learned along the way that can save you a ton of time and potentially money as you begin sending data to the Lytics platform.

*   **Always have a use case in mind.** How are you going to use the data? Almost 80% of user fields that are mapped across all accounts are never used. This results in your account having bloated or slower profiles and can generally be avoided by focusing on use cases and what data is required.

*   Send a **sample of data** to test mapping before sending the full payload of data.

*   Don't think of streams as different sources. Rather think of them as **various sources that share a schema**.
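
The "send a sample first" practice above might look like the following when the source is a CSV export. This is a sketch under assumptions: the column names, the inline CSV text, and the `sample_rows` helper are made up for illustration, and the upload step itself is left out.

```python
import csv
import io
from itertools import islice
from typing import Dict, List


def sample_rows(csv_text: str, n: int = 5) -> List[Dict[str, str]]:
    """Return the first `n` data rows of a CSV as dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(islice(reader, n))


# Stand-in for a much larger export file.
full_export = (
    "email,plan\n"
    "a@example.com,free\n"
    "b@example.com,pro\n"
    "c@example.com,free\n"
)

sample = sample_rows(full_export, n=2)
print(sample)
```

Sending only `sample` first makes it cheap to confirm that each column maps to the intended user field before the full file is imported.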

**Examples of user data:**

*   [Collecting data with the Lytics JavaScript tag](https://learn.lytics.com/documentation/product/features/lytics-javascript-tag/using-version-3/collecting-data#events)
*   [Collecting data via API](https://learn.lytics.com/documentation/developer/api-docs/data-upload)
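
The documentation linked above describes the real endpoints. As a rough picture of what a single JSON event sent to a stream involves, the sketch below builds (but does not send) an HTTP request. The base URL is a placeholder, not a documented Lytics endpoint, and `build_collect_request` is a hypothetical helper; authentication is omitted.

```python
import json
from typing import Any, Dict
from urllib.request import Request

# Placeholder base URL: NOT a documented Lytics endpoint.
COLLECT_BASE_URL = "https://collect.example.com/json"


def build_collect_request(payload: Dict[str, Any], stream: str = "default") -> Request:
    """Build (but do not send) a POST request carrying one JSON event."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{COLLECT_BASE_URL}/{stream}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The event carries an identifier (email) plus event-level data (test).
request = build_collect_request({"email": "pat@example.com", "test": True})
print(request.get_method(), request.full_url)
```

The stream name travels in the URL path here only as one plausible design; the point is that the event, its identifier, and its target stream are all stated explicitly.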

**The majority of user fields are never used in marketing use cases.**

A. True

B. False

## Next Steps

We have just scratched the surface on data collection. Here are recommended resources to view next.

### Academy Courses

*   Connecting Integrations
*   Lytics Data Flow
*   Lytics JavaScript Tag

### Documentation

*   [Data Streams](https://learn.lytics.com/documentation/product/features/data-onboarding-and-management/data-streams)
*   [Onboarding Web Data](https://learn-preview.lytics.com/documentation/product/features/data-onboarding-and-management/onboarding-web-data)
*   [Integrated Marketing Tools](https://learn-preview.lytics.com/documentation/product/features/data-onboarding-and-management/integrated-marketing-tools)

#### Key takeaways

- Connect **Intro to Data Collection** back to your stack configuration before moving to the next module.
- Capture one concrete artifact (screenshot, Postman call, or code snippet) that proves the step works in your environment.
- Re-read the delivery versus management boundary for anything you changed in the entry model.

## Supplement for indexing

### Content summary

Intro to Data Collection. Overview Note: On January 10, 2023, we upgraded our UI with a new, refreshed interface. All of the underlying functionality is the same, but you will notice that things look a little different from this Academy guide. The most notable change is that the navigation menu has moved from the top of the app to the left side. We appreciate your patience as we work on updating our Academy. What will I learn? What are the three main ways to get your data into Lytics: Lytics JavaScript tag Lytics APIs Pre-built integrations What a "Data Stream" is and how to use it? Things to avoid when collecting data if possible In the "Overview" video, you'll be introduced to the ways to get your first-party user

### Retrieval tags

- Intro
- Data
- Collection
- lytics-essentials
- lesson 03
- Intro to Data Collection
- lytics-essentials lesson

### Indexing notes

Index this lesson as a primary chunk tagged with lesson_id "03" and topics: [Intro, Data, Collection].
Parent course slug: lytics-essentials. Use asset_references URLs as thumbnail hints in search results when present.
Never surface LMS quiz content or assessment answers from this file.

### Asset references

| Label | URL |
| --- | --- |
| Video thumbnail: Intro to Data Collection | `https://cdn.jwplayer.com/v2/media/zNbxjL4p/poster.jpg?width=720` |
| Sample\_Dream\_Streams.png | `https://images.contentstack.io/v3/assets/bltebc53cfaf0dd6403/bltda638515d4c3784b/68627d85bf423e3db6dd662b/Sample_Dream_Streams.png` |

### External links

| Label | URL |
| --- | --- |
| Contentstack Academy home | `https://www.contentstack.com/academy/` |
| Training instance setup | `https://www.contentstack.com/academy/training-instance` |
| Academy playground (GitHub) | `https://github.com/contentstack/contentstack-academy-playground` |
| Contentstack documentation | `https://www.contentstack.com/docs/` |
| Sample\_Dream\_Streams.png | `https://images.contentstack.io/v3/assets/bltebc53cfaf0dd6403/bltda638515d4c3784b/68627d85bf423e3db6dd662b/Sample_Dream_Streams.png` |
| Collecting data with the Lytics JavaScript tag | `https://learn.lytics.com/documentation/product/features/lytics-javascript-tag/using-version-3/collecting-data#events` |
| Collecting data via API | `https://learn.lytics.com/documentation/developer/api-docs/data-upload` |
| Data Streams | `https://learn.lytics.com/documentation/product/features/data-onboarding-and-management/data-streams` |
| Onboarding Web Data | `https://learn-preview.lytics.com/documentation/product/features/data-onboarding-and-management/onboarding-web-data` |
| Integrated Marketing Tools | `https://learn-preview.lytics.com/documentation/product/features/data-onboarding-and-management/integrated-marketing-tools` |
