WEBVTT

1
00:00.000 --> 00:02.960
In this video, we get to chat about datasets.

2
00:02.960 --> 00:07.200
And a really good analogy for a dataset is thinking about walking into a

3
00:07.200 --> 00:07.800
restaurant.

4
00:07.800 --> 00:08.800
Let's do that.

5
00:08.800 --> 00:12.870
Let's imagine you and I are walking into a restaurant and we get seated and we

6
00:12.870 --> 00:13.640
probably

7
00:13.640 --> 00:14.960
want to order some food.

8
00:14.960 --> 00:17.320
So how do we know what's available?

9
00:17.320 --> 00:18.600
Well we have a menu.

10
00:18.600 --> 00:20.680
And then from that menu, what would we do?

11
00:20.680 --> 00:24.960
We would select certain information or certain details from that menu.

12
00:24.960 --> 00:29.980
And that is a very close analogy regarding a dataset in the world of

13
00:29.980 --> 00:30.720
FortiAnalyzer as we're

14
00:30.720 --> 00:33.120
pulling information from the SQL database.

15
00:33.120 --> 00:36.520
So the SQL database would represent everything that's possibly on the menu.

16
00:36.520 --> 00:40.810
And what we're doing is we're selecting specific parts or pieces from that SQL

17
00:40.810 --> 00:41.600
database.

18
00:41.600 --> 00:44.950
Now also at the restaurant, going back to the menu analogy, we may have some

19
00:44.950 --> 00:45.640
restrictions

20
00:45.640 --> 00:46.640
on our diets.

21
00:46.640 --> 00:50.800
For example, we may not be tolerant to gluten or we may not be able to have

22
00:50.800 --> 00:51.280
nuts.

23
00:51.280 --> 00:52.600
We may have a nut allergy or something.

24
00:52.600 --> 00:57.180
So in addition to selecting from the menu, we'd also want to do some filtering

25
00:57.180 --> 00:57.760
and say,

26
00:57.760 --> 01:02.380
you know what, if there are any nuts, I'll go ahead and put a no symbol there,

27
01:02.380 --> 01:03.120
no nuts.

28
01:03.120 --> 01:06.230
That would also be a consideration regarding things we're going to order from

29
01:06.230 --> 01:06.800
the menu.

30
01:06.800 --> 01:09.680
Or perhaps there's some element that they can actually remove.

31
01:09.680 --> 01:12.670
So as a filter, we could say no nuts and that would also restrict what we're

32
01:12.670 --> 01:13.560
going to get.

33
01:13.560 --> 01:18.840
So the dataset is all about choosing or selecting data and pulling it from the

34
01:18.840 --> 01:20.080
FortiAnalyzer

35
01:20.080 --> 01:24.210
that would then go into our report and then based on the charts and the other

36
01:24.210 --> 01:24.760
layouts

37
01:24.760 --> 01:28.940
inside of that report, that data from the FortiAnalyzer SQL database based on the

38
01:28.940 --> 01:29.320
select

39
01:29.320 --> 01:31.440
statement would populate that report.

40
01:31.440 --> 01:35.040
And again, in the world of Fortinet, they refer to that as a dataset.

41
01:35.040 --> 01:38.790
Again, behind the scenes, it's just a select statement with considerations

42
01:38.790 --> 01:39.080
against the

43
01:39.080 --> 01:40.080
SQL database.

44
01:40.080 --> 01:44.820
And also if we're using macros, think of that like a subset of the datasets,

45
01:44.820 --> 01:45.600
little mini

46
01:45.600 --> 01:51.360
me versions of datasets looking for very refined, small pieces of information.

47
01:51.360 --> 01:54.570
And because they can be predefined, we can pull and use them over and over

48
01:54.570 --> 01:55.080
again.

49
01:55.080 --> 01:56.800
So here's an example of a dataset.

50
01:56.800 --> 02:01.190
Let's imagine that we need to create a custom dataset, although there's lots

51
02:01.190 --> 02:01.880
and lots of

52
02:01.880 --> 02:03.240
them already in place.

53
02:03.240 --> 02:05.890
But let's imagine we want to create our own, and let's imagine we want to look

54
02:05.890 --> 02:06.760
for and pull

55
02:06.760 --> 02:08.160
from the log files.

56
02:08.160 --> 02:10.720
So the way we represent log files is dollar sign log, $log.

57
02:10.720 --> 02:13.320
And then we use a select statement that says pull from the log.

58
02:13.320 --> 02:16.200
Now the log isn't literally the log anymore.

59
02:16.200 --> 02:20.460
It's pulling from the SQL database in a database table that has been built

60
02:20.460 --> 02:21.280
based on the log

61
02:21.280 --> 02:23.680
information that came in to the FortiAnalyzer.
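
NOTE
A minimal sketch of what "pull from the log" looks like in a dataset query. It assumes $log is the placeholder that FortiAnalyzer resolves to the SQL log table for the dataset's selected log type, and that srcip is one of that table's columns (an assumed field name, not shown in the video).
```sql
-- $log is a placeholder for a SQL table, not a path to a raw log file
select srcip
from $log
```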

62
02:23.680 --> 02:28.490
So the dataset is 100% all built on pulling data from the SQL database, not

63
02:28.490 --> 02:29.240
from raw log

64
02:29.240 --> 02:30.240
files.

65
02:30.240 --> 02:33.250
Then we could put a filter in place, like narrowing it down, saying, you know what,

66
02:33.250 --> 02:33.720
instead of just

67
02:33.720 --> 02:38.590
pulling everything from the log tables, let's go ahead and say filter where the

68
02:38.590 --> 02:39.240
category

69
02:39.240 --> 02:42.640
description equals let's go ahead and say proxy avoidance.

70
02:42.640 --> 02:45.570
And when we put in the syntax, we're going to put an open and close single

71
02:45.570 --> 02:46.080
quote there

72
02:46.080 --> 02:48.480
because we're going to put a little space between proxy and avoidance.

73
02:48.480 --> 02:51.160
And that way it knows we want to look for that exact category description.

74
02:51.160 --> 02:54.660
So the single open quote and the single end quote here, that tells the select

75
02:54.660 --> 02:55.200
statement

76
02:55.200 --> 02:58.110
as part of the SQL database and what we're looking for, we're looking for just

77
02:58.110 --> 02:58.720
the category

78
02:58.720 --> 03:00.920
description of proxy avoidance.

79
03:00.920 --> 03:03.860
And then we could group the upcoming information based on source IP and host

80
03:03.860 --> 03:04.520
name and other

81
03:04.520 --> 03:05.640
elements as well.
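
NOTE
The filter and grouping just described could be written roughly as the clause fragment below. The column names catdesc, srcip, and hostname are assumptions standing in for category description, source IP, and host name, and the exact category string may differ on a given system.
```sql
-- single quotes make 'Proxy Avoidance', space included, one exact value to match
where catdesc = 'Proxy Avoidance'
group by srcip, hostname
```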

82
03:05.640 --> 03:09.530
So when we build a report, behind the scenes it's also including a dataset that's

83
03:09.530 --> 03:10.040
pulling

84
03:10.040 --> 03:13.880
in the correct information so it can populate the charts and tables and so

85
03:13.880 --> 03:14.600
forth in the

86
03:14.600 --> 03:15.600
final report.

87
03:15.600 --> 03:19.470
And for individuals who are comfortable, which is not the majority of people on

88
03:19.470 --> 03:20.200
the planet,

89
03:20.200 --> 03:25.100
but for people who are comfortable with SQL, with the SQL language and the actual

90
03:25.100 --> 03:25.760
syntax for

91
03:25.760 --> 03:29.850
SQL databases, they're going to have a much easier time in creating their own

92
03:29.850 --> 03:31.120
custom datasets

93
03:31.120 --> 03:34.480
if they already understand the SQL queries and how those commands are

94
03:34.480 --> 03:35.360
structured.

95
03:35.360 --> 03:38.580
For those people who are new to SQL queries, it's not going to be as easy, but

96
03:38.580 --> 03:38.920
the good

97
03:38.920 --> 03:43.850
news is they probably don't have to create too many custom datasets, which

98
03:43.850 --> 03:44.560
equate to

99
03:44.560 --> 03:49.620
custom queries, because on the FortiAnalyzer, there's tons and tons of datasets

100
03:49.620 --> 03:50.200
already

101
03:50.200 --> 03:52.240
predefined, ready to rock and roll.

102
03:52.240 --> 03:56.160
So to help reinforce the concept of a dataset and how it's used, let's go back

103
03:56.160 --> 03:56.800
to the

104
03:56.800 --> 03:57.800
FortiAnalyzer once again.

105
03:57.800 --> 03:58.880
And here we go.

106
03:58.880 --> 04:02.960
So back at the FortiAnalyzer, we'll go back down to reports, expand that.

107
04:02.960 --> 04:07.730
And with report definition selected, we have all reports, templates, chart

108
04:07.730 --> 04:08.680
library, macro

109
04:08.680 --> 04:12.120
library, and then on the far right here, datasets.

110
04:12.120 --> 04:16.120
So currently with this flavor of the FortiAnalyzer, there's close to 2000.

111
04:16.120 --> 04:21.450
Look at the bottom right-hand corner there: close to 2000 pre-built, ready-to-go

112
04:21.450 --> 04:22.280
datasets.

113
04:22.280 --> 04:26.970
So here, with all these predefined datasets, it's showing the log type, for example,

114
04:26.970 --> 04:27.520
where

115
04:27.520 --> 04:30.720
the data would be coming from. It's also showing over here origin, which

116
04:30.720 --> 04:31.360
means these are

117
04:31.360 --> 04:32.560
all built in.

118
04:32.560 --> 04:33.560
We also have options here.

119
04:33.560 --> 04:36.830
If we click on view options, we can go ahead and say, I only want to see custom,

120
04:36.830 --> 04:37.240
for

121
04:37.240 --> 04:40.970
example, and we don't have any custom datasets yet, or we can say, I only want

122
04:40.970 --> 04:41.560
to see the

123
04:41.560 --> 04:43.880
built in ones, and they'll do it here.

124
04:43.880 --> 04:46.560
And we have options for just seeing the FortiGuard ones as well.

125
04:46.560 --> 04:48.480
So I'll go ahead and say, let's show all of them.

126
04:48.480 --> 04:51.600
And as an example, let's go ahead and grab one.

127
04:51.600 --> 04:56.960
So how about this one right here, 360 security applications by bandwidth.

128
04:56.960 --> 05:01.210
Now if we hover there, it's actually showing us the select statement that it's

129
05:01.210 --> 05:01.760
using with

130
05:01.760 --> 05:04.520
all of its conditions regarding that dataset.

131
05:04.520 --> 05:07.080
So thank heavens, if we want to use that, we wouldn't have to create our own.

132
05:07.080 --> 05:08.960
We could use this dataset.

133
05:08.960 --> 05:12.830
Or if we go to another one, let's scroll down a little bit, how about right

134
05:12.830 --> 05:13.400
here?

135
05:13.400 --> 05:15.400
App risk virus discovered.

136
05:15.400 --> 05:19.760
If we hover here, in the pop-up it's showing us the actual SQL statement.

137
05:19.760 --> 05:22.000
So if we double click, we also can take a closer look.

138
05:22.000 --> 05:25.400
So here's the actual query that would bring all that back in.

139
05:25.400 --> 05:28.210
And if we're creating our own, one of the other cool things is we can go ahead

140
05:28.210 --> 05:28.640
and click

141
05:28.640 --> 05:33.060
on validate, and it will verify that our SQL query is correct and not

142
05:33.060 --> 05:33.880
missing any

143
05:33.880 --> 05:34.880
parts.

144
05:34.880 --> 05:36.680
There's also an option here to analyze query.

145
05:36.680 --> 05:38.480
So here's the original query.

146
05:38.480 --> 05:40.400
Again, this is the default one that came with it.

147
05:40.400 --> 05:44.000
Here's the transform SQL and it says, well, you know what, the one we gave you

148
05:44.000 --> 05:44.520
is pretty

149
05:44.520 --> 05:45.520
darn good.

150
05:45.520 --> 05:46.520
We don't need to transform it.

151
05:46.520 --> 05:48.160
And then there's an hcache query list.

152
05:48.160 --> 05:52.440
And what an hcache query is, it's a query that's been done in advance.

153
05:52.440 --> 05:56.360
So when you run a report, if it's already cached, the query is already cached,

154
05:56.360 --> 05:56.920
the report can

155
05:56.920 --> 05:58.360
be generated a lot faster.

156
05:58.360 --> 06:03.140
So again, this query is perfect because it was provided by FortiAnalyzer, by

157
06:03.140 --> 06:03.720
Fortinet.

158
06:03.720 --> 06:05.400
So we'll click on cancel there.

159
06:05.400 --> 06:09.300
And just as an example of creating a dataset, if we needed to, probably one of

160
06:09.300 --> 06:09.760
the best

161
06:09.760 --> 06:14.420
options would be to right click one that comes close to what you need, clone it,

162
06:14.420 --> 06:15.000
and then

163
06:15.000 --> 06:16.280
go ahead and customize it.

164
06:16.280 --> 06:19.280
Or if you want to create one from scratch, we can do that as well.

165
06:19.280 --> 06:22.120
With datasets selected, click here on create new.

166
06:22.120 --> 06:26.800
We'll call our dataset looking for proxy avoidance.

167
06:26.800 --> 06:30.170
So for the log type, we'll click the drop down here and we'll go down and we'll

168
06:30.170 --> 06:30.640
select

169
06:30.640 --> 06:34.990
web filter, because that's what I want to look for, for the web category of

170
06:34.990 --> 06:36.280
proxy avoidance.

171
06:36.280 --> 06:38.640
And then here we would put in our SQL query.

172
06:38.640 --> 06:41.600
So let's have some fun with this.

173
06:41.600 --> 06:45.560
I'm going to type in select, which means go ahead and pull the data.

174
06:45.560 --> 06:47.120
So I want to pull from source IP.

175
06:47.120 --> 06:50.480
You can think of a field like a column of data in the SQL database.

176
06:50.480 --> 06:54.610
I want to pull from source IP column and the host name column and also the

177
06:54.610 --> 06:55.760
category description

178
06:55.760 --> 06:56.760
column.

179
06:56.760 --> 06:59.370
So for this next statement, I'm going to put in select source IP, host name,

180
06:59.370 --> 06:59.920
and category

181
06:59.920 --> 07:00.920
description.

182
07:00.920 --> 07:03.320
But I have to tell it where we're pulling from.

183
07:03.320 --> 07:05.240
So here I'll type in from space.

184
07:05.240 --> 07:09.780
And then I want to go ahead and pull from $log, which is a table in the SQL

185
07:09.780 --> 07:10.640
database.

186
07:10.640 --> 07:14.940
So think of it like I want to look at or get the source IP information, host

187
07:14.940 --> 07:16.280
name information,

188
07:16.280 --> 07:19.550
category description information, think of those like columns from this

189
07:19.550 --> 07:20.320
table called

190
07:20.320 --> 07:21.320
$log.

191
07:21.320 --> 07:25.240
And if we hover here, it's also showing us all the various fields available.

192
07:25.240 --> 07:28.760
So if we sorted through all those, we would find source IP, host name, and

193
07:28.760 --> 07:30.160
category description

194
07:30.160 --> 07:31.160
as three of those.

195
07:31.160 --> 07:33.200
So for readability, I'll press enter.

196
07:33.200 --> 07:37.310
And then let's go ahead and put a filter in place because we don't want to see

197
07:37.310 --> 07:37.560
all the

198
07:37.560 --> 07:41.050
information from the log table regarding source IP, host name, and category

199
07:41.050 --> 07:41.920
description.

200
07:41.920 --> 07:43.200
We want to filter it down.

201
07:43.200 --> 07:46.880
So I'm going to type in where, space, and then I'll put in dollar sign filter, $filter,

202
07:46.880 --> 07:48.360
another space.

203
07:48.360 --> 07:52.050
And after dollar sign filter, I'm also going to go ahead and put an and

204
07:52.050 --> 07:53.560
category description.

205
07:53.560 --> 07:57.160
So I'll select that here from the drop down, put in our space, where I have ADN,

206
07:57.160 --> 07:57.680
I need

207
07:57.680 --> 07:58.680
to fix that.

208
07:58.680 --> 08:00.240
And it's not ADN.

209
08:00.240 --> 08:03.940
So here I have where $filter and category description, and I'll say equals, and

210
08:03.940 --> 08:04.640
I have a single

211
08:04.640 --> 08:08.160
quote and proxy avoidance and a close single quote.

212
08:08.160 --> 08:09.320
And then we can group them.

213
08:09.320 --> 08:12.760
Then I'll say group by source IP.

214
08:12.760 --> 08:16.500
I'll pick that from the list there after I start typing, comma, then host name,

215
08:16.500 --> 08:16.800
I'll

216
08:16.800 --> 08:21.240
select that and comma, category description and select that.

217
08:21.240 --> 08:24.630
I'll press enter one more time, then I'll type in order by and I'll go ahead

218
08:24.630 --> 08:25.200
and order

219
08:25.200 --> 08:27.480
by source IP and that looks good.
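
NOTE
Put together, the query typed in this demo would look something like the sketch below. The field names srcip, hostname, and catdesc are assumed spellings of the source IP, host name, and category description fields picked from the drop-down; $log and $filter are the placeholders spoken as "dollar sign log" and "dollar sign filter".
```sql
select srcip, hostname, catdesc
from $log
where $filter and catdesc = 'Proxy Avoidance'
group by srcip, hostname, catdesc
order by srcip
```
Every selected column also appears in the group by, so the query stays valid without an aggregate function.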

220
08:27.480 --> 08:32.160
Now let me tell you what, in prepping for this, it's been a long time since I've

221
08:32.160 --> 08:32.720
had

222
08:32.720 --> 08:35.080
to write any SQL statements whatsoever.

223
08:35.080 --> 08:38.200
So I used some help here with the validate option.

224
08:38.200 --> 08:41.360
I'll click on validate and it would tell me, oh, we need this.

225
08:41.360 --> 08:42.360
Oh, we need that.

226
08:42.360 --> 08:43.840
Oh, that doesn't make sense.

227
08:43.840 --> 08:48.190
So after several practices in a simple demo here, I finally got my syntax

228
08:48.190 --> 08:48.960
correct and

229
08:48.960 --> 08:51.560
the validate option here helped me as well.

230
08:51.560 --> 08:54.280
We could also click on format, and that just formats it.

231
08:54.280 --> 08:57.970
It's the same query, but just done in a slightly different look and feel, which

232
08:57.970 --> 08:58.460
might make

233
08:58.460 --> 08:59.460
it easier to read.

234
08:59.460 --> 09:02.600
So once again, I'll click on validate here just to confirm I have all the

235
09:02.600 --> 09:03.720
syntax it needs

236
09:03.720 --> 09:05.840
and it says no validation issues found.

237
09:05.840 --> 09:08.520
Now another cool thing here is we could actually test it.

238
09:08.520 --> 09:11.670
So I'll scroll down a little bit and I'm going to say for the time period, let's

239
09:11.670 --> 09:12.520
go ahead

240
09:12.520 --> 09:17.010
and let's say anytime this year. It currently is 2025, and so it'd be from

241
09:17.010 --> 09:18.080
January 1st all

242
09:18.080 --> 09:20.400
the way to the end of the year and I'll click on go.

243
09:20.400 --> 09:25.070
And that way it just validates for me that this dataset, which effectively is a

244
09:25.070 --> 09:25.720
SQL query

245
09:25.720 --> 09:28.480
with some filtering in place is actually working.

246
09:28.480 --> 09:31.040
So if we scroll down here, sure enough, here's the source IP.

247
09:31.040 --> 09:34.760
So I don't have IPv6 in place and that's what this represents right here, but I

248
09:34.760 --> 09:35.040
do have

249
09:35.040 --> 09:37.400
IPv4 addresses on that device.

250
09:37.400 --> 09:40.830
Here's the host name where the proxy avoidance was triggered and there's the

251
09:40.830 --> 09:42.080
category description.

252
09:42.080 --> 09:44.280
So I know that this query works and we could use it.

253
09:44.280 --> 09:48.350
So I'll go ahead and click on okay and now if we click here on view options and

254
09:48.350 --> 09:48.840
we just

255
09:48.840 --> 09:52.950
take a look at custom datasets, there's our custom dataset and if we hover over

256
09:52.950 --> 09:53.320
it, it's

257
09:53.320 --> 09:55.440
showing us what that query is looking for.

258
09:55.440 --> 10:00.330
So now that we have this dataset in place, we could then use it as part of a

259
10:00.330 --> 10:01.040
report that

260
10:01.040 --> 10:02.040
we're creating.

261
10:02.040 --> 10:05.820
However, before we get to that point, I also want to walk you through how we

262
10:05.820 --> 10:06.640
can work with

263
10:06.640 --> 10:09.440
existing charts or create new charts as well.

264
10:09.440 --> 10:13.100
So in the next video, let's take a look at the chart library and many of the

265
10:13.100 --> 10:13.560
options

266
10:13.560 --> 10:17.570
we have there regarding charts that will be populated with the data pulled

267
10:17.570 --> 10:18.200
based on the

268
10:18.200 --> 10:19.200
dataset in use.

269
10:19.200 --> 10:22.280
So I'll see you in the next video as we take a look at charts.
