1
00:00:00,000 --> 00:00:11,560
In this video, you'll learn what you can do if you need to troubleshoot Linux memory.

2
00:00:11,560 --> 00:00:16,200
To start with, use free -m. That will give you a generic overview of current memory usage.

3
00:00:16,200 --> 00:00:22,240
And you should check if a minimum of 20% of RAM is available as free or cached. If it's

4
00:00:22,240 --> 00:00:28,400
not, you need to further investigate what is going on. Also, consider the use of swap

5
00:00:28,400 --> 00:00:35,040
space. Are you currently using any swap? Do you even have swap? And if swap is used, use

6
00:00:35,040 --> 00:00:41,360
vmstat to get usage information. Also, check logs to find out if the OOM killer is being

7
00:00:41,360 --> 00:00:42,360
invoked.

8
00:00:42,360 --> 00:00:47,480
Let me show you. In order to investigate memory problems, it would be cool if you have memory

9
00:00:47,480 --> 00:00:53,279
problems. And I'm going to reboot and limit the amount of available memory while rebooting

10
00:00:53,279 --> 00:00:57,720
so that we can see that actually something is happening. Because right now, on this four

11
00:00:57,720 --> 00:01:05,040
gig system, nothing really is happening. So once you are in the GRUB menu, at the end

12
00:01:05,040 --> 00:01:11,639
of the line that loads the Linux kernel, temporarily remove rhgb and quiet and then add

13
00:01:11,639 --> 00:01:17,760
mem=1G. That will limit the system to one gigabyte of memory. And that makes it all a little

14
00:01:17,760 --> 00:01:24,800
bit more interesting. Next, Ctrl+X to start. So, on to the terminal to investigate, and a good

15
00:01:24,800 --> 00:01:35,160
first step is to use free -m. And now we see interesting stuff. So one gigabyte of

16
00:01:35,160 --> 00:01:40,000
memory is what I passed while starting up. Unfortunately, for my system, part of that

17
00:01:40,000 --> 00:01:47,839
memory is used for the graphics card. So we only have 708 remaining. And that means that

18
00:01:47,839 --> 00:01:57,680
we fail all sanity checks. So 529 used, only 56 free. And you can also see that 770 mebibytes

19
00:01:57,680 --> 00:02:04,239
of memory is currently in swap. What does that mean? Well, we have more memory in swap

20
00:02:04,239 --> 00:02:08,960
than we have total memory available. And that is perfect for troubleshooting. So I

21
00:02:08,960 --> 00:02:17,080
would like to use vmstat 2 5: a two-second interval, five polling loops in total. And

22
00:02:17,080 --> 00:02:22,559
what I'm observing is swap activity. And here you can see the swap activity, which is pretty

23
00:02:22,559 --> 00:02:29,320
close to block activity. Now the thing that you are not supposed to like here is that swap in and

24
00:02:29,320 --> 00:02:34,520
swap out are continuously happening. And I wasn't even doing anything. And that is really

25
00:02:34,520 --> 00:02:41,479
an indicator of a poor state of your memory. Of course, we have 10 times the amount

26
00:02:41,479 --> 00:02:46,479
of swap as the amount of RAM. So it's going to take a while before you are running out of

27
00:02:46,479 --> 00:02:53,960
swap. But you are seriously running out of memory. I would also advise you to grep on active

28
00:02:53,960 --> 00:03:00,440
memory in /proc/meminfo. I'm skipping the A, by the way, so that I don't have to worry about

29
00:03:00,440 --> 00:03:07,160
whether it's uppercase or lowercase. And this is showing what I want to see. So what I see is that

30
00:03:07,160 --> 00:03:16,119
I have this amount of active memory. And here I have my inactive memory. And this is application

31
00:03:16,119 --> 00:03:26,199
memory. And the thing is 104244 kilobytes, that means we have 104 megabytes of inactive application

32
00:03:26,199 --> 00:03:31,759
memory, while the amount of memory that we have in swap is way higher. And that means that active memory is

33
00:03:31,759 --> 00:03:37,479
being swapped out. As it doesn't show as active and inactive anonymous memory, which is your typical

34
00:03:37,479 --> 00:03:44,759
application memory, we are most likely talking about kernel procedures and routines. So beyond any

35
00:03:44,759 --> 00:03:53,240
doubt, this system is way out of memory. Then I'm going to do something fun to trigger an out of

36
00:03:53,240 --> 00:03:59,600
memory event. I want you to be able to recognize an out-of-memory condition. And in order to do so, first

37
00:03:59,600 --> 00:04:08,639
I'm echoing an h to the file /proc/sysrq-trigger. The sysrq trigger gives access to some magical

38
00:04:08,639 --> 00:04:15,720
controls on your Linux system. Now, if you echo an h, you get help. And this help is only visible in the

39
00:04:15,720 --> 00:04:22,119
dmesg output. And the help I'm looking for is memory-full-oom-kill. You can see that on the

40
00:04:22,160 --> 00:04:29,279
bottom, it's the letter f that we need to echo. So this time, I'm going to echo an f to

41
00:04:29,279 --> 00:04:38,079
/proc/sysrq-trigger. And oh boy. Oh, do you see that? "Device memory is almost full." The terminal was using a lot of

42
00:04:38,079 --> 00:04:45,799
memory. And I'm going to start my terminal again, at least I'm going to try, because I would like to show

43
00:04:45,799 --> 00:05:00,079
you what you can find in the logs if this happens. So journalctl, uppercase G to go all the way to the

44
00:05:00,079 --> 00:05:07,920
end. And then I'm scrolling up a little bit. And here I can see out of memory, out of memory killed

45
00:05:07,920 --> 00:05:14,720
process, whatever. What you should be looking for is oom-kill. Now, by echoing an f into

46
00:05:14,720 --> 00:05:21,799
/proc/sysrq-trigger, I manually forced an OOM kill. What you should be worried about is if you can see it in any other

47
00:05:21,799 --> 00:05:29,079
circumstances. Now, this is the same OOM kill as the one that I just triggered. And if there's only one of

48
00:05:29,079 --> 00:05:36,600
them, then it's not an issue. But if you have multiple out-of-memory kills that occurred, then really, you

49
00:05:36,600 --> 00:05:45,200
have a problem. So that is how you can identify poorly behaving memory. Let's boot back into a normal system. So

50
00:05:45,200 --> 00:05:52,320
one solution if you are running out of memory is to allocate some swap. So swap space can be created on a block

51
00:05:52,320 --> 00:05:59,559
device or in a swap file. And if you are using a partition, you must set the partition type to swap and actually

52
00:05:59,559 --> 00:06:07,839
use mkswap to format it as a swap partition. And you use swapon to activate the swap space. Normally, swap is

53
00:06:07,839 --> 00:06:15,440
added to /etc/fstab. And from there, systemd will pick it up and mount it as a swap unit type. Let me show you

54
00:06:15,440 --> 00:06:23,000
how this works. So let's talk about swap. We are back in a normal operational state where we have 8 gigs of swap,

55
00:06:23,000 --> 00:06:29,880
which is not used at all. But I want to show you how to use swap. So what am I going to do? Well, I'm going to use

56
00:06:29,880 --> 00:06:42,320
dd: if=/dev/zero, of=/swapfile. We're going to make a swap file this time. bs=1M. And

57
00:06:42,320 --> 00:06:51,519
count=1024 to make a one-gigabyte swap file. Good. Then what are we going to do? Well, I don't have to create a

58
00:06:51,519 --> 00:07:01,399
partition this time. But I do need mkswap on /swapfile to format the swap file as a swap device. And there you

59
00:07:01,399 --> 00:07:08,600
can see that the mkswap command is kind enough to tell me about the insecure permissions. So I need to fix it using

60
00:07:08,600 --> 00:07:18,200
chmod 600 on /swapfile. The reason is that only the root user should have read and write access to the swap file,

61
00:07:18,200 --> 00:07:26,679
nobody else. You can even see that the swap file has created a label. Now, as this is a file, and files typically don't

62
00:07:26,679 --> 00:07:36,720
change their name spontaneously, I'm just going to put it in /etc/fstab as such. So /swapfile. And I'm going to

63
00:07:36,720 --> 00:07:44,519
mount it the way you should mount swap. And for swap, we have seen it before. There's a couple of examples here. So how do I

64
00:07:44,519 --> 00:07:55,040
mount it? Well, I mount it on none. The type is swap. The mount options are defaults, and then the 0 and 0. And next, I can

65
00:07:55,040 --> 00:08:03,799
use swapon -a, which is comparable to mount -a, and it will activate all swap that is in /etc/fstab and not yet

66
00:08:04,079 --> 00:08:13,799
initialized. So when I use free -m, I can see that I have another gigabyte of swap available. swapon -s, by the way,

67
00:08:13,799 --> 00:08:21,440
is showing all the different swap devices. So here we can see that we really got it all. We have a swap logical volume, a swap

68
00:08:21,440 --> 00:08:31,160
partition and a swap file. Also interesting to see is the priority: -2, -3, -4. The -2 priority is used

69
00:08:31,160 --> 00:08:38,520
first. And that's how you can add swap to give your system the opportunity to calm down if it is running out of memory.

