# Customizing Schema (fields & mappings)

### About this export

| Field | Value |
| --- | --- |
| **content_type** | lesson |
| **platform** | contentstack-academy |
| **source_url** | https://www.contentstack.com/academy/courses/data-insights-data-ingestion-profile-construction/data-insights-course-3--customizing-schema |
| **course_slug** | data-insights-data-ingestion-profile-construction |
| **lesson_slug** | data-insights-course-3--customizing-schema |
| **markdown_file_url** | /academy/md/courses/data-insights-data-ingestion-profile-construction/data-insights-course-3--customizing-schema.md |
| **generated_at** | 2026-04-28T06:55:44.147Z |

> Part of **[Data Ingestion & Profile Construction](https://www.contentstack.com/academy/courses/data-insights-data-ingestion-profile-construction)** on Contentstack Academy. **Academy MD v3** — structured for retrieval; no quiz or assessment keys.

<!-- ai_metadata: {"lesson_id":"04","type":"video","duration_seconds":540,"video_url":"https://cdn.jwplayer.com/previews/fGpn7GIn","thumbnail_url":"https://cdn.jwplayer.com/v2/media/fGpn7GIn/poster.jpg?width=720","topics":["Customizing","Schema","fields","mappings"]} -->

#### Video details

#### At a glance

- **Title:** 12-data-insights-custom-schema
- **Duration:** 9m
- **Media link:** https://cdn.jwplayer.com/previews/fGpn7GIn
- **Publish date (unix):** 1752872612

#### Streaming renditions

- application/vnd.apple.mpegurl (HLS)
- audio/mp4 · AAC audio · 113479 bps (~113 kbps)
- video/mp4 · 180p · 140452 bps (~140 kbps)
- video/mp4 · 270p · 157205 bps (~157 kbps)
- video/mp4 · 360p · 174433 bps (~174 kbps)
- video/mp4 · 540p · 224977 bps (~225 kbps)

#### Timed text tracks (delivery)

- **thumbnails:** `https://cdn.jwplayer.com/strips/fGpn7GIn-120.vtt`

#### Transcript

The main thing I want to cover in this session is how fields and mappings work and how they relate to streams. A perfect example: we have a common schema, so a bunch of the heavy lifting has already been done. We'll connect Mailchimp in a little while and it automatically gets its fields and mappings, but it's inevitable that you're going to want to push custom data into Lytics. So I built a very simple, also Game of Thrones-themed, CSV where we're pulling in things like email, first name, and last name, and we use the attribute fields that come pre-mapped. But then you run into this field, likelihood to rain, which is akin to, say, a custom score coming out of your warehouse or wherever it may be. It isn't in the system; it's totally custom. We need to make sure that when we upload this data, whichever stream we upload it to, it ultimately gets mapped properly to the profile. So I wanted to walk through that example together.

If I were to upload this information right now, with no mappings, you would see it in the streams. I'm not going to do that, but you could upload it to whatever stream you want, go over here, and see the raw data that was actually received. It just won't have anywhere to go: it will sit in the stream as sort of a metric, it will never get mapped to the profile, and it will only confuse you. So it's really important to know that streams, to Eric's earlier point, are just a construct to help you separate the logic for mapping that data to a particular field.

In our case, we want to add the field likelihood to rain. There are a couple of different approaches, but because we know this field is net new, you first need to create a field: you need to give the information a place on the profile where it can actually live. We don't want to use any of the common schema fields, so I'm going to go to create new. You first select an ID, which can be any sort of slugified input, so I'll paste in likelihood to rain; we'll call it that for now, with a display label along the lines of "custom score, likelihood to rain," which is really just how the field is surfaced in the UI. Then we choose the data type. In our case it looks like an integer, because we want to be able to do things like greater-than and less-than in the segmentation engine. That's one really important point when choosing the data type: it determines how you can actually leverage the field in segmentation. For instance, if I made this a string, ingestion would work, but when I go to build a segment and want to find anybody greater than 50 on likelihood to rain, it won't let me, because it can't do a numeric operation on a string. Those kinds of things are really important to keep in mind as you're doing custom mapping. There are obviously lots of other data types you could use: arrays, string arrays, time arrays, maps, et cetera. We're just going to choose an integer here. You can add an optional description, which is purely for the UI, so we won't do that here.
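To make the data-type point concrete, here is a minimal Python sketch, purely illustrative and not the product's API, of why a "greater than 50" segment rule needs the field typed as an integer. The column name and sample row are assumptions mirroring the lesson's CSV.

```python
# Illustrative only: why a field's declared data type matters for segmentation.
# The CSV column "likelihood_to_rain" is an assumption based on the lesson.
csv_row = {
    "email": "jon.snow@example.com",
    "first_name": "Jon",
    "last_name": "Snow",
    "likelihood_to_rain": "95",  # CSV values always arrive as text
}

# Kept as a string, the value cannot answer a numeric segment rule:
# csv_row["likelihood_to_rain"] > 50   # TypeError in Python; a segmentation
#                                      # engine similarly refuses numeric
#                                      # operators on string-typed fields.

# Declared as an integer, the same value supports greater-than/less-than:
score = int(csv_row["likelihood_to_rain"])
print(score > 50)  # True -> this profile would qualify for the segment
```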
You can also mark a field as an identity key, though only if it's a string; you can't do that with an integer, and we wouldn't want to for likelihood to rain anyway. If I did it by accident, the net result would essentially be that anybody with, say, a 95 score for likelihood to rain would get merged into one super profile, which is definitely not what you want. That's why it's very important to check the identity key box only when it's absolutely necessary and you actually want to use the field to merge data together.

For the merge operator, because it's an integer, you'll see there are a number of different ways you can handle this data. Maybe I want a total purchase amount over time: you could merge the integer as a sum, so as new events come in it keeps adding them up, and you end up with one value representing total purchases over time. You could just count the events, so instead of the specific number you get how many times the field has changed. There are also max, min, latest, and oldest. In our case we want the field to hold the actual score value, and we want it to be the latest value, so that if a new one comes in tomorrow when we push this information, it gets updated and overridden. So we're going to choose latest.
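As a quick illustration of those merge operators, here is a minimal Python sketch, assuming each upload delivers a new value for the same profile field. The function is illustrative, not the platform's implementation; the operator names follow the transcript.

```python
# Illustrative merge operators: collapse a stream of incoming values
# for one profile field into the single value stored on the profile.
def merge(values: list[int], op: str) -> int:
    if op == "sum":      # running total, e.g. lifetime purchase amount
        return sum(values)
    if op == "count":    # how many times the field was seen, not its value
        return len(values)
    if op == "max":
        return max(values)
    if op == "min":
        return min(values)
    if op == "latest":   # keep the most recent value (chosen in the lesson)
        return values[-1]
    if op == "oldest":   # keep the first value ever seen
        return values[0]
    raise ValueError(f"unknown merge operator: {op}")

scores = [40, 72, 95]            # likelihood_to_rain across three uploads
print(merge(scores, "latest"))   # 95 -> tomorrow's upload would override it
print(merge(scores, "sum"))      # 207 -> what "sum" would have stored instead
```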
We'll leave all of the other settings; they're actually grayed out in this case because you can't set them for an integer. If we had chosen an array or a map, you'd have some more options, to Eric's earlier point, around the size limit and how long some of that information hangs around.

The other really important thing is that you can flag whether a field contains PII. Because a CDP collects user information, a lot of that information can be personally identifiable, which becomes really important. We have lots of controls in the UI to hide or show that information for particular users, and in a lot of cases our customers will encrypt, encode, or Base64-encode an email address so it's never exposed. So there are lots of privacy controls you'll see; this flag just marks that the field may contain PII so the proper controls can be applied throughout the UI. Then there are some categorization options that are totally optional; they just show up in profile filtering when you're exploring. We'll say this one is behavior for now.

That's how we're going to configure our field. If I hit create field, it essentially just gives the data a place to live on the profile. But there's still no logic for taking the raw data that's coming in and mapping it onto that particular profile; there's still that gap. That's where mappings come in.

If I go back to my field, likelihood to rain, and open current mappings, I see there are no mappings yet. So we'll hit create new mapping, and you'll see three concepts, two of which we haven't touched on yet. Stream says where I want this mapping to apply: I can choose any of the streams that exist, and we'll just do demo custom CSV. I'm going to copy this and put it over here so I don't forget it. Then you have the opportunity to add an expression or a condition. For expressions, there are a few different functions available. You saw one on email, as an example, which takes the value, lowercases it, and does some basic validation. There are functions to normalize phone numbers, and ways to split, uppercase, or lowercase values; there's a whole list of the different expressions in our documentation that we can come back to. I wouldn't call it a cleansing layer by any means, but it's one of the ways you can manipulate incoming values to make them consistent. In our case we just want a direct translation of likelihood to rain. (Oops, I pasted the wrong value; that's not what we want.) All I want to do is take the key that will come in as likelihood to rain in our CSV and say: anytime that value comes in on the demo custom CSV stream, map it up to the profile.

Conditions we'll come back to, and I'll show you a few different examples, but they give you a way to say, when mapping conversion events, or events in general: if it's a particular type, then map it this way. You can add essentially if/else/equals-style logic to a mapping, so there's conditional logic about whether it should map the field or not, as opposed to this case, where anytime likelihood to rain shows up in this particular stream, it gets mapped to the field I chose, the custom score for likelihood to rain.

So I'm going to create... oh yeah, go ahead. [Eric] One quick comment: if this data were coming from an untrusted source, you would probably want to put an expression in there to validate that it's a number in the range of 1 to 100. But since this is a source we generated, and we know it's clean the whole way through, we can just take the value straight from the CSV. [Presenter] Yep. We'll pull up the docs and show some of the conditions and expressions, just so you know where they are. There's a whole bunch you can ultimately do from a logic perspective, but we'll try to keep it as simple as possible for this example. So I am going to create that mapping for that field.
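Conceptually, the mapping just created behaves like the sketch below: scoped to one stream, reading the raw CSV key, optionally running an expression, and emitting the value for the custom field. All names here (demo_custom_csv, the function names, the 1-to-100 range check Eric suggested for untrusted sources) are illustrative assumptions, not the product's real identifiers or API.

```python
from typing import Optional

def normalize_email(raw: str) -> Optional[str]:
    # Expression example from the lesson: lowercase plus basic validation.
    value = raw.strip().lower()
    return value if "@" in value else None

def map_likelihood_to_rain(stream: str, record: dict) -> Optional[int]:
    # Mappings are scoped to a stream; other streams are left untouched.
    if stream != "demo_custom_csv":
        return None
    raw = record.get("likelihood to rain")  # the key as it appears in the CSV
    if raw is None:
        return None
    try:
        value = int(raw)                    # CSV values arrive as text
    except ValueError:
        return None
    # The validation Eric suggested for untrusted sources: 1 to 100.
    if not 1 <= value <= 100:
        return None
    return value  # stored on the profile via the "latest" merge operator

print(normalize_email("Jon.Snow@Example.com"))               # jon.snow@example.com
print(map_likelihood_to_rain("demo_custom_csv",
                             {"likelihood to rain": "95"}))  # 95
```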

#### Subtitles (WebVTT)

```webvtt
WEBVTT

1
00:00:00.000 --> 00:00:19.160
The main thing that I want to cover in this session is how the fields and the mappings

2
00:00:19.160 --> 00:00:21.060
work and how their relationship is to stream.

3
00:00:21.060 --> 00:00:25.920
So a perfect example of that is we have a common schema, a bunch of the heavy lifting

4
00:00:25.920 --> 00:00:29.600
has already been done, we'll connect MailChimp in a little while and it automatically gets

5
00:00:29.600 --> 00:00:34.100
the fields and mappings, but it is inevitable that you're going to want to push custom data

6
00:00:34.100 --> 00:00:35.100
into Lytics.

7
00:00:35.100 --> 00:00:41.320
So I built a very simple, also Game of Thrones themed CSV, where we're pulling in things

8
00:00:41.320 --> 00:00:44.760
like email and first name and last name, and we use the attribute field, which comes kind

9
00:00:44.760 --> 00:00:45.900
of pre-mapped.

10
00:00:45.900 --> 00:00:49.640
But then you run into this field where it's likelihood to rain, which is, you know, akin

11
00:00:49.640 --> 00:00:53.520
to say like a custom score that's coming out of your warehouse or whatever it may be.

12
00:00:53.520 --> 00:00:57.300
So this is a thing that isn't in the system, it's totally custom.

13
00:00:57.300 --> 00:01:01.980
We need to make sure that when we upload this data, whichever stream we upload it to,

14
00:01:01.980 --> 00:01:05.800
it's going to ultimately get mapped properly to that particular profile.

15
00:01:05.800 --> 00:01:11.020
So I wanted to walk through that example together anyway, so we can just start to do that.

16
00:01:11.020 --> 00:01:17.980
So if I were to upload this information right now with no mappings, you would see it in

17
00:01:17.980 --> 00:01:18.980
the streams.

18
00:01:18.980 --> 00:01:23.100
So I'm not going to do that, but you could ultimately upload this to whatever stream

19
00:01:23.100 --> 00:01:24.380
you would want.

20
00:01:24.420 --> 00:01:29.580
You could go over here and you would see the raw data in the stream that actually was received,

21
00:01:29.580 --> 00:01:31.140
but it's not going to have anywhere to go with it.

22
00:01:31.140 --> 00:01:33.860
So it's going to sit in the stream as sort of a metric.

23
00:01:33.860 --> 00:01:35.780
It's never going to get mapped to the profile.

24
00:01:35.780 --> 00:01:37.380
It's just going to sort of confuse you.

25
00:01:37.380 --> 00:01:41.700
So it's really, really important to know that streams to Eric's point are kind of just a

26
00:01:41.700 --> 00:01:48.020
construct to help you separate the logic for mapping that data to a particular field.

27
00:01:48.020 --> 00:01:53.120
So in our case, we want to add the field likelihood to rain.

28
00:01:53.120 --> 00:01:55.800
The way that I would start to do this, and there's a couple of different approaches,

29
00:01:55.800 --> 00:02:00.840
but because we know that this field is net new, you're going to first need to go create

30
00:02:00.840 --> 00:02:01.840
a field, right?

31
00:02:01.840 --> 00:02:05.720
So you need to put a place on the profile where this information can actually live.

32
00:02:05.720 --> 00:02:09.920
We don't want to use any of the common schema fields, so I'm going to go to create new.

33
00:02:09.920 --> 00:02:11.320
You first select an ID.

34
00:02:11.320 --> 00:02:16.360
This can be any sort of sluggified input, so I'm just going to paste likelihood to rain.

35
00:02:16.360 --> 00:02:18.400
We'll just call it that for now.

36
00:02:18.400 --> 00:02:29.080
We'll say, you know, custom score on likelihood to rain, which is really just surfaced in the

37
00:02:29.080 --> 00:02:30.840
information in the UI.

38
00:02:30.840 --> 00:02:32.400
We'll choose the data type.

39
00:02:32.400 --> 00:02:35.000
So in our case, it looks like it's an integer.

40
00:02:35.000 --> 00:02:40.120
We want to be able to do things like greater than less than in the segmentation engine.

41
00:02:40.120 --> 00:02:43.720
That's one really important point when you're choosing the data type.

42
00:02:43.720 --> 00:02:48.560
It impacts the way that you can actually leverage that particular field in segmentation.

43
00:02:48.560 --> 00:02:52.440
So for instance, if I were to make this a string, it would work, but when I go to build

44
00:02:52.440 --> 00:02:58.320
a segment, if I wanted to find anybody that was like greater than 50 on likelihood to rain,

45
00:02:58.320 --> 00:02:59.320
it's not going to work.

46
00:02:59.320 --> 00:03:02.640
It's not going to allow you to do that because it can't do a sort of numeric operation on

47
00:03:02.640 --> 00:03:04.000
a string.

48
00:03:04.000 --> 00:03:10.520
So those kind of things are really important to keep in mind as you're doing custom mapping.

49
00:03:10.520 --> 00:03:14.760
Obviously there's lots of different data types that you could use.

50
00:03:14.760 --> 00:03:19.000
There's arrays, there's string arrays, there's time arrays, there's maps, et cetera, et cetera.

51
00:03:19.000 --> 00:03:21.960
We're just going to choose an integer in this case.

52
00:03:21.960 --> 00:03:23.200
You can add an optional description.

53
00:03:23.200 --> 00:03:26.140
This is just purely for the UI, so we won't do that here.

54
00:03:26.140 --> 00:03:28.200
You could choose it as an identity key.

55
00:03:28.200 --> 00:03:30.120
If it's a string, you can't do that with an integer.

56
00:03:30.120 --> 00:03:33.120
We don't want to do that with likelihood to rain.

57
00:03:33.120 --> 00:03:38.920
If I were to accidentally do that, the net result would essentially be that anybody that

58
00:03:38.920 --> 00:03:45.120
had, say, like a 95 score of likelihood to rain would be merged into one super profile, which

59
00:03:45.120 --> 00:03:47.160
is definitely not what you would want to do.

60
00:03:47.160 --> 00:03:51.720
So that's why only clicking the identity key when it's absolutely necessary and you want

61
00:03:51.720 --> 00:03:54.720
to use it to merge data together is very, very important.

62
00:03:54.720 --> 00:03:59.080
For a merge operator, because it's an integer, you'll see that there's a number of different

63
00:03:59.080 --> 00:04:01.720
ways that you can ultimately handle this data.

64
00:04:01.720 --> 00:04:05.880
Maybe I want the total purchase amount over time.

65
00:04:05.880 --> 00:04:07.280
You could do the integer as a sum.

66
00:04:07.280 --> 00:04:10.120
As new events come in, it's going to continue to add those up so that you have this one

67
00:04:10.120 --> 00:04:14.000
value that represents total purchase over time.

68
00:04:14.000 --> 00:04:15.400
You could just count the events.

69
00:04:15.400 --> 00:04:19.280
So instead of having an understanding of, like, the specific number, it's just how many

70
00:04:19.280 --> 00:04:22.280
times I've seen this sort of field change.

71
00:04:22.280 --> 00:04:24.280
The max number, the min number, the latest, the oldest.

72
00:04:24.280 --> 00:04:27.880
In our case, we want it to just represent the actual score value, and we want it to

73
00:04:27.880 --> 00:04:32.240
be the latest value so that if a new one comes in tomorrow when we push this information,

74
00:04:32.240 --> 00:04:33.520
it gets updated and overridden.

75
00:04:33.520 --> 00:04:35.680
So we're going to do latest.

76
00:04:35.680 --> 00:04:39.200
And then we'll leave all of the other fields.

77
00:04:39.200 --> 00:04:41.640
They're all grayed out, actually, in this case, because you can't set them.

78
00:04:41.640 --> 00:04:45.480
If we were to set or choose an array or a map, you'd have some more options to Eric's

79
00:04:45.480 --> 00:04:51.280
kind of earlier point on the size limit and how long some of that information hangs around.

80
00:04:51.280 --> 00:04:56.640
The other thing that's really important is you can flag your fields as if they contain

81
00:04:56.640 --> 00:04:58.240
PII or not.

82
00:04:58.240 --> 00:05:02.480
Because a CDP is collecting user information, a lot of that information can be personally

83
00:05:02.480 --> 00:05:04.420
identifiable, which becomes really, really important.

84
00:05:04.420 --> 00:05:09.860
We have lots of controls in the UI to hide or show that information for particular users.

85
00:05:09.860 --> 00:05:15.780
In a lot of cases, our customers will maybe encrypt or encode or Base64 encode an email

86
00:05:15.780 --> 00:05:17.220
address to never expose that.

87
00:05:17.220 --> 00:05:21.180
So there's lots of controls that you'll see around privacy.

88
00:05:21.180 --> 00:05:26.460
This just lets you know the field may contain PII so that throughout the UI, we can put

89
00:05:26.460 --> 00:05:27.940
the proper controls in place.

90
00:05:27.940 --> 00:05:32.140
And then there's some categorization options that are totally optional.

91
00:05:32.140 --> 00:05:36.500
They just show up in the actual profile kind of filtering when you're exploring.

92
00:05:36.500 --> 00:05:40.060
So we'll just say that that's behavior for now.

93
00:05:40.060 --> 00:05:43.220
So this is how we're going to configure our field.

94
00:05:43.220 --> 00:05:48.140
If I hit create field, this essentially just gives it a place to live on the profile.

95
00:05:48.140 --> 00:05:53.420
But there's still no logic on how do I take the raw data that's coming in and map it to

96
00:05:53.420 --> 00:05:54.900
that particular profile, right?

97
00:05:54.900 --> 00:05:56.780
There's still that kind of gap.

98
00:05:56.780 --> 00:05:58.480
So that's where mappings come in.

99
00:05:58.560 --> 00:06:04.240
So if I go back to my field, likelihood to rain, I'll go to current mappings and I see that

100
00:06:04.240 --> 00:06:06.280
there's no mappings.

101
00:06:06.280 --> 00:06:13.400
So we'll hit create new mapping and you'll see three new concepts that we have not touched

102
00:06:13.400 --> 00:06:14.400
on yet.

103
00:06:14.400 --> 00:06:16.840
So well, two of the three we haven't touched on yet.

104
00:06:16.840 --> 00:06:20.600
So stream is going to say, where do I want to create this mapping to?

105
00:06:20.600 --> 00:06:28.040
I can choose one of the streams that exist, or we'll just do demo custom CSV.

106
00:06:28.080 --> 00:06:34.000
I'm going to copy this and put it over here so I don't forget that.

107
00:06:34.000 --> 00:06:38.480
And then you have an opportunity to do an expression or a condition.

108
00:06:38.480 --> 00:06:42.600
So what the expression does is there's a few different functions that we have.

109
00:06:42.600 --> 00:06:45.960
So you saw it like on email, as an example, where it's going to take it, it's going to

110
00:06:45.960 --> 00:06:49.040
lowercase it, it's going to do some basic validation.

111
00:06:49.040 --> 00:06:50.840
There's things for phone number to normalize.

112
00:06:50.840 --> 00:06:54.280
There's ways that you can split, you can uppercase, you can lowercase.

113
00:06:54.280 --> 00:06:58.040
There's a whole list of the different expressions in our documentation that we can kind of come

114
00:06:58.040 --> 00:06:59.280
back to.

115
00:06:59.280 --> 00:07:04.320
But this is one of the, I wouldn't call it a cleansing layer by any means, but it's one

116
00:07:04.320 --> 00:07:08.200
of the ways that you can manipulate some of the endpoints to make them consistent.

117
00:07:08.200 --> 00:07:14.240
In our case, we're just going to do likelihood to rain.

118
00:07:14.240 --> 00:07:19.480
So we just want to like take a direct translation, oops, I pasted the wrong value.

119
00:07:19.480 --> 00:07:21.900
That's not what we want.

120
00:07:21.900 --> 00:07:26.300
So all I want to do is I want to take the key that'll ultimately come in as likelihood

121
00:07:26.300 --> 00:07:28.060
to rain in our CSV.

122
00:07:28.060 --> 00:07:31.780
And I want to say anytime that that value comes in, we're going to, as long as it's

123
00:07:31.780 --> 00:07:36.700
on the demo custom CSV stream, we're going to map it up to that profile.

124
00:07:36.700 --> 00:07:39.940
The conditions we'll come back to, and I'll show you a few different examples, but conditions

125
00:07:39.940 --> 00:07:46.820
give you a way to say, like when we're mapping conversion events or events in general, if

126
00:07:46.820 --> 00:07:49.740
it's a particular type, then I want to map it in this way.

127
00:07:49.740 --> 00:07:55.580
So you can add essentially if else equals type logic to a particular mapping so that

128
00:07:55.580 --> 00:07:59.980
there's conditional logic, whether it should map that field or not, as opposed to just

129
00:07:59.980 --> 00:08:05.260
in this case, anytime that likelihood to rain shows up in this particular stream, it's going

130
00:08:05.260 --> 00:08:11.200
to map it to the field that I chose, which is this custom score likelihood to rain.

131
00:08:11.200 --> 00:08:13.420
So I'm going to create, oh yeah, go ahead.

132
00:08:13.420 --> 00:08:14.980
One comment here real quick.

133
00:08:14.980 --> 00:08:21.260
If this was like a untrusted source, where this data was coming from, you probably would

134
00:08:21.260 --> 00:08:24.980
want to put an expression in there that would validate that it's a number in the range of

135
00:08:24.980 --> 00:08:27.120
one to a hundred.

136
00:08:27.120 --> 00:08:31.380
But since this is a source that we generated that we know is clean the whole way through,

137
00:08:31.380 --> 00:08:34.220
we can just take the value straight from the CSV.

138
00:08:34.220 --> 00:08:35.220
Yep.

139
00:08:35.220 --> 00:08:39.540
And we'll pull up the docs and show some of like the conditions and expressions and whatnot,

140
00:08:39.540 --> 00:08:40.780
just so you know where it is.

141
00:08:40.780 --> 00:08:45.940
There's a whole bunch of things that you can do ultimately from a logic perspective, but

142
00:08:45.940 --> 00:08:47.980
try to keep it as simple as possible for this example.

143
00:08:47.980 --> 00:08:51.460
So I am going to create that mapping for that field.

```

#### Key takeaways

- Connect **Customizing Schema (fields & mappings)** back to your stack configuration before moving to the next module.
- Capture one concrete artifact (screenshot, Postman call, or code snippet) that proves the step works in your environment.
- Re-read the delivery versus management boundary for anything you changed in the entry model.

## Supplement for indexing

### Content summary

Customizing Schema (fields & mappings) in Data Ingestion & Profile Construction (data-insights-data-ingestion-profile-construction): creating a custom profile field (ID, data type, merge operator, PII flag) and mapping incoming stream data onto it.

### Retrieval tags

- Customizing
- Schema
- fields
- mappings
- data-insights-data-ingestion-profile-construction
- lesson 04
- Customizing Schema (fields & mappings)
- data-insights-data-ingestion-profile-construction lesson

### Indexing notes

Index this lesson as a primary chunk tagged with lesson_id "04" and topics: [Customizing, Schema, fields, mappings].
Parent course slug: data-insights-data-ingestion-profile-construction. Use asset_references URLs as thumbnail hints in search results when present.
Never surface LMS quiz content or assessment answers from this file.

### Asset references

| Label | URL |
| --- | --- |
| Video thumbnail: Customizing Schema (fields & mappings) | `https://cdn.jwplayer.com/v2/media/fGpn7GIn/poster.jpg?width=720` |

### External links

| Label | URL |
| --- | --- |
| Contentstack Academy home | `https://www.contentstack.com/academy/` |
| Training instance setup | `https://www.contentstack.com/academy/training-instance` |
| Academy playground (GitHub) | `https://github.com/contentstack/contentstack-academy-playground` |
| Contentstack documentation | `https://www.contentstack.com/docs/` |
