1
00:00:00,000 --> 00:00:18,260
Hey guys and welcome back. So now what I want to briefly talk to you about is the concept

2
00:00:18,260 --> 00:00:23,839
of being able to predict future resource needs. Now to begin, one of the first tools

3
00:00:23,839 --> 00:00:30,079
I want to talk to you about is something called collectd. Now what this is, is ultimately

4
00:00:30,079 --> 00:00:37,840
a daemon. And this daemon has the purpose of collecting system statistics. Now the way

5
00:00:37,840 --> 00:00:43,840
collectd actually does this is that it uses something called plugins. And these plugins

6
00:00:43,840 --> 00:00:51,120
actually determine what collectd will collect. Now one thing to note about collectd

7
00:00:52,079 --> 00:00:57,759
is that there are some limitations here. This is not an all-purpose tool. For example, the purpose

8
00:00:57,759 --> 00:01:04,479
is really limited to collecting, if I could spell that correctly, collecting data.

9
00:01:04,479 --> 00:01:10,719
That means that it's not going to display the data for you in any nice graphical form or anything

10
00:01:10,719 --> 00:01:16,560
like that or analyze the data. Instead, collectd is just going to, like I say, collect that data and

11
00:01:16,560 --> 00:01:22,960
you can then feed that data, if you so wish, into another program. Now collectd typically is not

12
00:01:22,960 --> 00:01:29,520
installed by default on most Linux systems. So we do actually have to run through the installation.

13
00:01:29,520 --> 00:01:35,920
So what I will say here on my Xubuntu machine, I can just use the apt package manager and install

14
00:01:35,920 --> 00:01:43,680
collectd. Just type in my password and hit Y and Enter. Okay, perfect. So now that we have collectd

15
00:01:43,680 --> 00:01:48,160
installed, let me just clear the screen. One of the things that we want to be aware of for the

16
00:01:48,160 --> 00:01:55,200
purposes of the examination is where we can actually configure the collectd daemon. Now we

17
00:01:55,200 --> 00:02:00,240
are going to be able to do this by modifying a file in a particular location. So let me show you what

18
00:02:00,240 --> 00:02:07,280
this location is. So if I go into the /etc directory and then go into collectd, and do an ls, we should

19
00:02:07,280 --> 00:02:15,199
now see this collectd.conf file. Okay, so try to remember that path: /etc/collectd/collectd.conf.

20
00:02:15,199 --> 00:02:21,840
If I go in and modify this, let's say with nano, for example, and hit Enter. This is the main configuration

21
00:02:21,840 --> 00:02:29,199
file. Now within here, we can do many cool things. We can actually determine how often we should be

22
00:02:29,199 --> 00:02:35,520
able to collect particular data. Say, for example, here we can see we have this value, Interval

23
00:02:36,159 --> 00:02:43,920
10. This means that every 10 seconds, we will have a new query collecting more data. Now this is the

24
00:02:43,920 --> 00:02:50,640
default value. If we happen to want to change this, we would uncomment this configuration by removing

25
00:02:50,640 --> 00:02:54,800
the, what is now called, hashtag, and we can see the color changed. We could change this

26
00:02:54,800 --> 00:03:02,160
interval to maybe say every five seconds or every 15 seconds, however you choose. But for now, I'm

27
00:03:02,159 --> 00:03:07,199
actually not too bothered with this. So I'll just comment this out. So it's inactive and we'll still

28
00:03:07,199 --> 00:03:13,680
have the default of 10. Now if we scroll on down, another really important thing to notice is this

29
00:03:13,680 --> 00:03:20,240
plugin section right here. This is going to allow us to specify which features we actually want to

30
00:03:20,240 --> 00:03:25,680
collect information on. Now if you happen to scroll down, you'll notice that most of these plugins

31
00:03:25,680 --> 00:03:32,400
are commented out. The ones which are actually on by default are the battery, the CPU, the disk,

32
00:03:32,400 --> 00:03:38,960
entropy, interfaces, memory, and so on and so forth. But the vast majority, like I say, are commented out. But

33
00:03:38,960 --> 00:03:44,960
if we wanted to decide that hey, we actually want to collect information on our network, we could go

34
00:03:44,960 --> 00:03:51,439
to the network plugin and we could just remove the hashtag to uncomment this configuration.

35
00:03:51,520 --> 00:03:57,520
And this would allow us to activate this configuration. Say for example, I could just save this up and

36
00:03:57,520 --> 00:04:03,280
notice, I should have pointed this out, this is a global configuration file within the /etc directory.

37
00:04:03,280 --> 00:04:11,280
Therefore, I'm going to have to use superuser privileges. So I would say sudo nano collectd.conf

38
00:04:11,280 --> 00:04:18,800
and scroll on down. Now I could remove and save the changes. And in order for these changes to

39
00:04:18,800 --> 00:04:26,160
take effect, we would actually have to restart the service. I could say sudo systemctl restart

40
00:04:26,160 --> 00:04:33,920
collectd.service. If I hit Enter, that should now restart the service and the new configurations,

41
00:04:33,920 --> 00:04:41,360
i.e. the changes we have just made will take effect. We can actually use collectd to monitor our

42
00:04:41,360 --> 00:04:47,280
network statistics. So really for the purposes of the examination, understand what collectd does

43
00:04:47,279 --> 00:04:54,159
and what it does not do and really try to remember the location of the configuration file and understand

44
00:04:54,159 --> 00:04:59,679
that we have these concepts of plugins. And if we want to activate a particular plugin so that we

45
00:04:59,679 --> 00:05:05,919
can measure a particular resource, such as the CPU or what is going on in our network, we can just

46
00:05:05,919 --> 00:05:12,079
simply uncomment that plugin so that it will be active after we reload the configuration file.

47
00:05:12,159 --> 00:05:18,079
Now another thing I want to just briefly touch upon is the subject of monitoring solutions.

48
00:05:18,079 --> 00:05:23,359
There are some solutions we have to have an awareness of, at least at a high level,

49
00:05:23,359 --> 00:05:29,199
for the purposes of the examination. But like I say, this is just a high level topic,

50
00:05:29,199 --> 00:05:34,799
meaning that you don't have to really concern yourself with how to configure these tools

51
00:05:34,799 --> 00:05:41,439
specifically, just more have an awareness of what they actually do. So the first monitoring solution

52
00:05:41,439 --> 00:05:47,600
I want to point out is one called Nagios or Nagios. I've heard it said both ways, I'm not quite sure

53
00:05:47,600 --> 00:05:55,120
the correct way. Now Nagios is ultimately a family of projects. So we could have a Nagios Log Server

54
00:05:55,120 --> 00:06:02,000
or a Nagios Network Analyzer or a Nagios Incident Manager, so on and so forth. Now the cool thing

55
00:06:02,000 --> 00:06:08,560
about Nagios is that there is a plugin to interact with collectd. We can actually use

56
00:06:08,560 --> 00:06:15,280
collectd-nagios and that can actually be used to query the collectd

57
00:06:15,280 --> 00:06:21,199
daemon for information. So the information that is collected by collectd, we can feed it into

58
00:06:21,199 --> 00:06:29,040
Nagios and use the Nagios Network Analyzer to analyze the network traffic or network statistics,

59
00:06:29,040 --> 00:06:35,920
should I say, that we have gathered using collectd. Now another solution we have is one called MRTG.

60
00:06:35,920 --> 00:06:43,600
This is the Multi Router Traffic Grapher and this is really used to monitor network traffic loads.

61
00:06:43,600 --> 00:06:48,480
It basically reads information from your router and your logs and you can actually take this

62
00:06:48,480 --> 00:06:54,960
information and graph it. This can provide a nice visual interpretation of that information

63
00:06:54,960 --> 00:07:00,560
to make it much more digestible for you as a system administrator. Now another one we want to be

64
00:07:00,639 --> 00:07:09,519
aware of is one called Cacti. This too, like MRTG, also focuses on network monitoring. Now this tool

65
00:07:09,519 --> 00:07:18,000
is basically a front-end interface for something called RRDtool. Now this stands for the round

66
00:07:18,000 --> 00:07:25,040
robin database tool. So simply put, information stored by the round robin database tool

67
00:07:25,040 --> 00:07:31,760
can be ingested by Cacti and Cacti will provide to you a nice easy interface so that you can

68
00:07:31,760 --> 00:07:37,840
display information or graph particular data received via this tool. Now the key thing to note

69
00:07:37,840 --> 00:07:44,560
about all of these types of tools is that we can take this information, monitor our data and we can

70
00:07:44,560 --> 00:07:51,920
use what we observe for capacity planning. Say, for example, you happen to note

71
00:07:51,920 --> 00:07:57,360
on your graph you're having a very large and prolonged spike in, maybe say, network traffic.

72
00:07:57,360 --> 00:08:03,840
You might think, hey, we're actually getting much more network use, maybe it's now time to increase our

73
00:08:03,840 --> 00:08:10,080
bandwidth because if we keep following this trajectory of network use just getting higher and

74
00:08:10,080 --> 00:08:16,160
higher and higher suddenly we're going to get a bottleneck and the users are going to suffer

75
00:08:16,160 --> 00:08:21,840
and we definitely do not want that. So having this type of monitoring information allows us to plan

76
00:08:21,839 --> 00:08:26,959
ahead and we can see trends on the horizon as they're approaching. So really these tools all

77
00:08:26,959 --> 00:08:34,159
together are going to help you identify resource exhaustion and allow you to predict growth and

78
00:08:34,159 --> 00:08:41,600
even diagnose particular problems. If you happen to see a spike in some particular resource, CPU,

79
00:08:41,600 --> 00:08:47,279
maybe it is network traffic, you can then use that information to potentially identify a

80
00:08:47,279 --> 00:08:54,240
particular problem on your network. So the ability to collect data to be able to feed that data into

81
00:08:54,240 --> 00:09:00,959
a tool which can graph that data or analyze that data, this is obviously very valuable for anyone

82
00:09:00,959 --> 00:09:07,199
administering and managing the systems on our network. Now one thing to say about these tools

83
00:09:07,199 --> 00:09:13,759
just as a disclaimer is that there is no way you can get 100% accuracy with these tools. You might

84
00:09:13,759 --> 00:09:19,360
be able to graph a particular trend but there is no way for you to know that that trend is going to

85
00:09:19,360 --> 00:09:25,039
continue. Say for example you've got a nice graph which is showing a very steady rate of growth

86
00:09:25,039 --> 00:09:30,639
over a particular period of time. There is no way for you to know that suddenly you're going to

87
00:09:30,639 --> 00:09:37,279
encounter an abnormal spike. Okay so just be aware that these tools definitely have a use,

88
00:09:37,279 --> 00:09:42,559
they definitely provide valuable information but there are limitations to their abilities.

89
00:09:42,559 --> 00:09:46,799
Okay guys, I hope this has been informative for you and I'd like to thank you for viewing.