Hi,
I'm stuck with a nasty problem. I'm running a 4.4.13-based kernel with
BusyBox on an ARM system, but it seems to leak memory... during
working hours.
I know it is not the best way to do it, but I've been monitoring the MemFree
value from /proc/meminfo and I'm losing 2M/hour. That's not much, but
with only 100M of free memory the system crashes after two days. At least, that is
what I think makes it crash; I need more monitoring data to be sure.
So I'm wondering: how would you monitor this kind of problem? How can I
find out whether it is a kernel issue or related to some running program?
Kind regards,
wimpunk.
Reply by Don Y ● September 22, 2016
On 9/22/2016 1:51 AM, wimpunk wrote:
> Hi,
>
> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
> a busybox on an ARM-system but it seems to leak memory... during the
> working hours.
This suggests there are "non-working hours" (?). Does it incur losses
at those times? (Or, is it actually NOT working/running?)
What's it *doing* during the working hours? What *might* be calling for
additional memory in that time?
Is there a persistent store or are you logging to a memory disk, etc.?
> I know it is not the best way to do but I've monitoring the MemFree
> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
> having 100M memory free makes it crash after two days. Or at least what
> I think makes it crash, I need more monitored data to be sure.
>
> So I'm wondering: how would you monitor such kind of problem? How to
> find out if it is a kernel issue or related to some running program?
Where does /proc/meminfo show INCREASING memory usage?
When you kill(1) off "your" processes (i.e., anything that was not
part of the standard "system"), is the memory recovered correctly
by the kernel? Said another way, if you created a cron(8) task
to kill off your processes every hour and restart them immediately
thereafter, would the problem "go away" (i.e., be limited to a
maximum, non-accumulating loss of ~2MB)?
Once you know which process(es) are responsible for the loss, you
can explore them in greater detail.
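A minimal sketch of that experiment, under stated assumptions: `recovered_kb`, `MEMINFO` and `SETTLE_SECONDS` are names invented here for illustration, and the command you pass in stands for whatever stops "your" processes on your system. It reads MemFree before and after the command runs and prints the difference in kB.

```shell
# MEMINFO and SETTLE_SECONDS are hypothetical knobs for this sketch;
# by default the real /proc/meminfo is read.
MEMINFO=${MEMINFO:-/proc/meminfo}

# Print the current MemFree value in kB.
memfree_kb() {
    awk '/^MemFree:/ { print $2 }' "$MEMINFO"
}

# Run the given command (for example one that stops your processes),
# wait briefly, and print how many kB MemFree grew (negative if it shrank).
recovered_kb() {
    before=$(memfree_kb)
    "$@" >/dev/null
    sleep "${SETTLE_SECONDS:-2}"
    after=$(memfree_kb)
    echo $((after - before))
}
```

Called as `recovered_kb killall myapp` (with `myapp` as a placeholder name), a result close to the accumulated loss would point at that process; a result near zero would suggest looking elsewhere.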
Reply by Tim Wescott ● September 22, 2016
On Thu, 22 Sep 2016 10:51:39 +0200, wimpunk wrote:
> Hi,
>
> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
> a busybox on an ARM-system but it seems to leak memory... during the
> working hours.
> I know it is not the best way to do but I've monitoring the MemFree
> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
> having 100M memory free makes it crash after two days. Or at least what
> I think makes it crash, I need more monitored data to be sure.
>
> So I'm wondering: how would you monitor such kind of problem? How to
> find out if it is a kernel issue or related to some running program?
>
> Kind regards,
>
> wimpunk.
Can't you look at memory usage on a task-by-task basis with ps? How
about periodically running it, and looking for a task that's blowing up?
--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
Reply by Johann Klammer ● September 22, 2016
On 09/22/2016 10:51 AM, wimpunk wrote:
> Hi,
>
> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
> a busybox on an ARM-system but it seems to leak memory... during the
> working hours.
> I know it is not the best way to do but I've monitoring the MemFree
> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
> having 100M memory free makes it crash after two days. Or at least what
> I think makes it crash, I need more monitored data to be sure.
>
> So I'm wondering: how would you monitor such kind of problem? How to
> find out if it is a kernel issue or related to some running program?
>
> Kind regards,
>
> wimpunk.
>
Not ARM here, but old x86, so maybe not helpful:
this box (512 MB RAM) has MemFree: 13740 kB
file server (128 MB RAM) has MemFree: 2248 kB
It's the Linux VM caching all sorts of stuff in RAM.
Sooner or later forks will fail, or modules might not load
(basically anything that wants a bigger chunk of contiguous memory).
echo 3 > /proc/sys/vm/drop_caches
frees some memory. See if that helps
(use periodically).
...There's another tunable (/proc/sys/vm/user_reserve_kbytes, I think)
which is supposed to help with that, but when I tried it in the past,
module loading would still sometimes fail, and the box would go into swap storms all the time.
I don't know... perhaps they have fixed that by now...
Reply by Jack ● September 22, 2016
> On 09/22/2016 10:51 AM, wimpunk wrote:
> > Hi,
> >
> > I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
> > a busybox on an ARM-system but it seems to leak memory... during the
> > working hours.
> > I know it is not the best way to do but I've monitoring the MemFree
> > value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
> > having 100M memory free makes it crash after two days. Or at least what
> > I think makes it crash, I need more monitored data to be sure.
> >
> > So I'm wondering: how would you monitor such kind of problem? How to
> > find out if it is a kernel issue or related to some running program?
> >
> > Kind regards,
> >
> > wimpunk.
> >
> Not arm here, but old x86, so maybe not helpful:
>
> this box (512 Mb ram) has MemFree: 13740 kB
> file server (128 Mb ram) has MemFree: 2248kB
>
> It's the linux vm caching all sorts of stuff in ram.
> sooner or later forks will fail, or modules might not load.
> (basically anything that wants a bigger chunk of cont. memory)
>
>
> echo 3 > /proc/sys/vm/drop_caches
>
> frees some memory. see if that helps.
> (use periodically)
There is no point in doing that.
The kernel will automatically drop caches if processes need the memory.
Bye Jack
--
Yoda of Borg am I! Assimilated shall you be! Futile resistance is, hmm?
Reply by David Brown ● September 23, 2016
On 22/09/16 10:51, wimpunk wrote:
> Hi,
>
> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
> a busybox on an ARM-system but it seems to leak memory... during the
> working hours.
> I know it is not the best way to do but I've monitoring the MemFree
> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
> having 100M memory free makes it crash after two days. Or at least what
> I think makes it crash, I need more monitored data to be sure.
>
> So I'm wondering: how would you monitor such kind of problem? How to
> find out if it is a kernel issue or related to some running program?
>
> Kind regards,
>
> wimpunk.
>
What /exactly/ are you monitoring from /proc/meminfo? If you are
looking at MemFree, then you can expect it to go down regularly - once a
system has been used for a while, you don't want MemFree to be more than
about 10% of the system's memory. Remember, Linux uses free memory for
disk cache. It will clear out old disk cache if it needs the memory for
something else, but if the memory is not being used for processes, then
it is always best to store file data in the spare ram.
So if your system is doing nothing but writing logs to the disk, then it
will use steadily more memory for disk caching of the log files. It may
not be particularly useful to have the log files in cache, but it is
more useful than having nothing at all in memory.
Your key figure for the memory in use by processes (and therefore the
memory that might be leaking) is MemTotal - MemFree - Buffers - Cached.
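As a rough sketch, the memory in use can be read straight from /proc/meminfo as MemTotal minus MemFree, Buffers and Cached. The function name `used_kb` is made up for this example, and the optional file argument exists only so the calculation can be tried on a saved copy of meminfo.

```shell
# Print MemTotal - MemFree - Buffers - Cached in kB.
# Reads the file given as $1, or /proc/meminfo when no argument is given.
used_kb() {
    awk '
        /^MemTotal:/ { total = $2 }
        /^MemFree:/  { free = $2 }
        /^Buffers:/  { buffers = $2 }
        /^Cached:/   { cached = $2 }   # anchored, so SwapCached is not matched
        END { print total - free - buffers - cached }
    ' "${1:-/proc/meminfo}"
}
```

If this figure stays flat while MemFree keeps falling, the "loss" is most likely disk cache rather than a leak.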
Reply by wimpunk ● September 23, 2016
On 22/09/16 16:38, Don Y wrote:
> On 9/22/2016 1:51 AM, wimpunk wrote:
>> Hi,
>>
>> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
>> a busybox on an ARM-system but it seems to leak memory... during the
>> working hours.
>
> This suggests there are "non-working hours" (?). Does it incur losses
> at those times? (Or, is it actually NOT working/running?)
Non-working hours means outside 8 in the morning to 7 in the evening.
>
> What's it *doing* during the working hours? What *might* be calling for
> additional memory in that time?
>
> Is there a persistent store or are you logging to a memory disk, etc.?
>
We are saving the MemFree value on a monitoring server.
>> I know it is not the best way to do but I've monitoring the MemFree
>> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
>> having 100M memory free makes it crash after two days. Or at least what
>> I think makes it crash, I need more monitored data to be sure.
>>
>> So I'm wondering: how would you monitor such kind of problem? How to
>> find out if it is a kernel issue or related to some running program?
>
> Where does proc/meminfo show INCREASING memory usage?
>
> When you kill(8) off "your" processes (i.e., anything that was not
> part of the standard "system"), is the memory recovered correctly
> by the kernel? Said another way, if you created a cron(8) task
> to kill off your processes every hour and restart them immediately
> thereafter, would the problem "go away" (i.e., be limited to a
> maximum, non-accumulating loss of ~2MB)?
>
> Once you know which process(es) are responsible for the loss, you
> can explore them in greater detail.
>
Actually, the box is doing nothing, so there is very little to kill.
There is an ssh server to which we regularly connect to get
/proc/meminfo. The MemFree value is added to our monitoring system.
After monitoring MemFree for two days on two different systems, this is
what we got: //imagebin.ca/v/2w2uH4yCnAGu
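For reference, this kind of monitoring can be sketched in a few lines of shell. The function names and the log format (one "epoch_seconds MemFree_kB" pair per line) are assumptions made for this example, not what our monitoring system actually uses. The rate is a simple first-to-last estimate, so short-term noise in the samples is ignored.

```shell
# Append one sample, "epoch_seconds MemFree_kB", to the log file given as $1.
log_memfree() {
    echo "$(date +%s) $(awk '/^MemFree:/ { print $2 }' /proc/meminfo)" >> "$1"
}

# Print the average change of MemFree in kB per hour between the first
# and the last sample of the log file given as $1.
# A negative number means free memory is shrinking.
rate_kb_per_hour() {
    awk '
        NR == 1 { t0 = $1; m0 = $2 }
        { t1 = $1; m1 = $2 }
        END {
            if (NR < 2 || t1 == t0) { print "n/a"; exit }
            printf "%d\n", (m1 - m0) * 3600 / (t1 - t0)
        }
    ' "$1"
}
```

Running `log_memfree /tmp/memfree.log` from cron and calling `rate_kb_per_hour /tmp/memfree.log` now and then would give the loss rate directly; the log path is a placeholder.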
Reply by wimpunk ● September 23, 2016
On 22/09/16 16:53, Tim Wescott wrote:
> On Thu, 22 Sep 2016 10:51:39 +0200, wimpunk wrote:
>
>> Hi,
>>
>> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
>> a busybox on an ARM-system but it seems to leak memory... during the
>> working hours.
>> I know it is not the best way to do but I've monitoring the MemFree
>> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
>> having 100M memory free makes it crash after two days. Or at least what
>> I think makes it crash, I need more monitored data to be sure.
>>
>> So I'm wondering: how would you monitor such kind of problem? How to
>> find out if it is a kernel issue or related to some running program?
>>
>> Kind regards,
>>
>> wimpunk.
>
> Can't you look at memory usage on a task-by-task basis with ps? How
> about periodically running it, and looking for a task that's blowing up?
>
Hm, didn't know ps could show me the used memory...
Been searching, but I only found a way to show the percentage of memory.
I don't think that is accurate enough to see much difference.
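One possible way around the percentage column, sketched here as an assumption rather than a tested recipe for this box: the kernel exposes each process's resident set size in kB as the VmRSS line of /proc/<pid>/status, so it can be read without relying on which columns the installed ps supports. The function name `rss_by_process` and its optional root-directory argument are invented for this example.

```shell
# Print "RSS_kB PID NAME" for every process that reports VmRSS,
# largest first. $1 is the proc root, /proc by default.
# Kernel threads have no VmRSS line and are skipped.
# Only the first word of a process name is printed.
rss_by_process() {
    root=${1:-/proc}
    for f in "$root"/[0-9]*/status; do
        [ -r "$f" ] || continue
        awk '
            /^Name:/  { name = $2 }
            /^Pid:/   { pid = $2 }
            /^VmRSS:/ { rss = $2 }
            END { if (rss != "") print rss, pid, name }
        ' "$f" 2>/dev/null   # a process may exit while we are reading it
    done | sort -rn
}
```

Saving the output of `rss_by_process` every hour and comparing the snapshots should show whether any single process is growing by something like 2M/hour.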
Reply by wimpunk ● September 23, 2016
On 22/09/16 18:02, Johann Klammer wrote:
> On 09/22/2016 10:51 AM, wimpunk wrote:
>> Hi,
>>
>> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
>> a busybox on an ARM-system but it seems to leak memory... during the
>> working hours.
>> I know it is not the best way to do but I've monitoring the MemFree
>> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
>> having 100M memory free makes it crash after two days. Or at least what
>> I think makes it crash, I need more monitored data to be sure.
>>
>> So I'm wondering: how would you monitor such kind of problem? How to
>> find out if it is a kernel issue or related to some running program?
>>
>> Kind regards,
>>
>> wimpunk.
>>
> Not arm here, but old x86, so maybe not helpful:
>
> this box (512 Mb ram) has MemFree: 13740 kB
> file server (128 Mb ram) has MemFree: 2248kB
>
> It's the linux vm caching all sorts of stuff in ram.
> sooner or later forks will fail, or modules might not load.
> (basically anything that wants a bigger chunk of cont. memory)
>
>
> echo 3 > /proc/sys/vm/drop_caches
>
> frees some memory. see if that helps.
> (use periodically)
>
> ...There's another tunable (/proc/sys/vm/user_reserve_kbytes I think)
> which is supposed to help with that. but in the past I had tried that and
> module loading would still sometimes fail, and the box would go into swap-storms all the time.
> Idk... perhaps they fixed that by now...
>
>
I could use the drop_caches part when monitoring but according to top
the caches are pretty stable. I don't think I'm trapped by the kernel
cache.
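Using drop_caches while monitoring could look roughly like the sketch below, which flushes dirty pages, drops the clean caches and only then reads MemFree, so that successive samples are less affected by caching. The function name `memfree_after_drop` is made up here, and the two path arguments exist only so the function can be tried against ordinary files; on the real system it needs root.

```shell
# Flush dirty pages, ask the kernel to drop clean caches, then print
# MemFree in kB. $1 and $2 default to the real /proc files.
memfree_after_drop() {
    meminfo=${1:-/proc/meminfo}
    drop=${2:-/proc/sys/vm/drop_caches}
    sync
    echo 3 > "$drop" || return 1
    awk '/^MemFree:/ { print $2 }' "$meminfo"
}
```

If the value sampled this way still falls by about 2M/hour, the cache can be ruled out as the explanation.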
Reply by wimpunk ● September 23, 2016
On 22/09/16 21:17, Jack wrote:
> Johann Klammer <klammerj@NOSPAM.a1.net> wrote:
>
>> On 09/22/2016 10:51 AM, wimpunk wrote:
>>> Hi,
>>>
>>> I'm stuck with a nasty problem. I'm running an 4.4.13 based kernel with
>>> a busybox on an ARM-system but it seems to leak memory... during the
>>> working hours.
>>> I know it is not the best way to do but I've monitoring the MemFree
>>> value from /proc/meminfo and I'm losing 2M/hour. It's not that big but
>>> having 100M memory free makes it crash after two days. Or at least what
>>> I think makes it crash, I need more monitored data to be sure.
>>>
>>> So I'm wondering: how would you monitor such kind of problem? How to
>>> find out if it is a kernel issue or related to some running program?
>>>
>>> Kind regards,
>>>
>>> wimpunk.
>>>
>> Not arm here, but old x86, so maybe not helpful:
>>
>> this box (512 Mb ram) has MemFree: 13740 kB
>> file server (128 Mb ram) has MemFree: 2248kB
>>
>> It's the linux vm caching all sorts of stuff in ram.
>> sooner or later forks will fail, or modules might not load.
>> (basically anything that wants a bigger chunk of cont. memory)
>>
>>
>> echo 3 > /proc/sys/vm/drop_caches
>>
>> frees some memory. see if that helps.
>> (use periodically)
>
> There is no point in doing that.
> Kernel will automatically drop caches if processes need it.
>
> Bye Jack
>
But I consider it a good idea. It could be that I didn't take
the cache into account.