So, I’ll admit it: my dom0 had 128MB of RAM and no swap space.
It was meant to be somewhat embedded and that used to be just fine.
I had to move roughly 300GB using pvmove.
It worked quite OK until at some point it suddenly ran out of RAM?!
Now pvmove will neither recover on the next call nor abort; it’s just stuck with no working state.
And this is on a current system, yet the behaviour is strikingly similar to what an IRC friend ran into two years ago.
    Loading vgxen-lvxensave table
    Suppressed vgxen-lvxensave identical table reload.
    Resuming vgxen-lvxensave (253:17)
    Found volume group "vgxen"
    Loading vgxen-pvmove0 table
    Suppressed vgxen-pvmove0 identical table reload.
    Loading vgxen-lv_lab_cent--lfw_swap table
    Suppressed vgxen-lv_lab_cent--lfw_swap identical table reload.
    Resuming vgxen-lv_lab_cent--lfw_swap (253:19)
    Found volume group "vgxen"
    Loading vgxen-pvmove0 table
    SKilled
A look into dmesg shows the issue:
    HighMem per-cpu: empty
    Free pages: 1964kB (0kB HighMem)
    Active:9754 inactive:8 dirty:0 writeback:0 unstable:0 free:491 slab:2894 mapped-file:1270 mapped-anon:8577 pagetables:411
    DMA free:652kB min:172kB low:212kB high:256kB active:7508kB inactive:0kB present:16384kB pages_scanned:1101018 all_unreclaimable? yes
    lowmem_reserve: 0 0 120 120
    DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve: 0 0 120 120
    Normal free:1312kB min:1316kB low:1644kB high:1972kB active:31536kB inactive:4kB present:122880kB pages_scanned:2104924 all_unreclaimable? yes
    lowmem_reserve: 0 0 0 0
    HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve: 0 0 0 0
    DMA: 1*4kB 1*8kB 0*16kB 0*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 652kB
    DMA32: empty
    Normal: 2*4kB 5*8kB 1*16kB 3*32kB 0*64kB 3*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1312kB
    HighMem: empty
    1313 pagecache pages
    Swap cache: add 0, delete 0, find 0/0, race 0+0
    Free swap = 0kB
    Total swap = 0kB
    Free swap: 0kB
    34816 pages of RAM
    0 pages of HIGHMEM
    17953 reserved pages
    2055 pages shared
    0 pages swap cached
    0 pages dirty
    0 pages writeback
    1270 pages mapped
    2894 pages slab
    411 pages pagetables
    Out of memory: Killed process 8899 (pvmove).
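To put the page counts from that report into more familiar units, here is a small sketch. It assumes 4 kB pages, which the report itself supports: free:491 pages times 4 kB gives exactly the 1964kB shown as "Free pages". The dictionary keys and the helper name pages_to_mib are made up for this illustration; only the numbers come from the dmesg output.

```python
# Assumption: 4 kB pages (consistent with free:491 pages == 1964kB in the report).
PAGE_KB = 4

# Page counts copied from the OOM report above.
counts = {
    "total RAM": 34816,
    "reserved": 17953,
    "mapped-anon": 8577,
    "slab": 2894,
    "pagecache": 1313,
    "free": 491,
}


def pages_to_mib(pages, page_kb=PAGE_KB):
    """Convert a page count to MiB for the given page size in kB."""
    return pages * page_kb / 1024


sizes_mib = {name: pages_to_mib(n) for name, n in counts.items()}

for name, mib in sizes_mib.items():
    print(f"{name:12s} {counts[name]:6d} pages = {mib:6.1f} MiB")
```

The anonymous mappings alone come to about 33.5 MiB on a box where less than 2 MiB is free and there is no swap to fall back on.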
So, why am I here bitching if I configured too little RAM?
Well, because the root cause was a memory leak in pvmove – it had already moved quite a few volumes, some of them even bigger, without a problem, and then it suddenly sucked up all the RAM.
If we consider that the PE size in this VG was only 4MB, we can be quite sure it didn’t run out of space for data it was supposed to keep in RAM: nothing more than one 4MB extent and the bitmap for the volume in question would need to be in RAM, and that might add up to about 20MB for a bigger volume…
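A rough back-of-the-envelope check of that estimate, as a sketch only. The 4MB PE size and the roughly 300GB come from above; treating the whole 300GB as one volume is a deliberate worst case, and the per-extent bookkeeping costs (one bit for a bitmap, or a generous 64 bytes for a full table entry) are pure assumptions for illustration, not LVM2 internals.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

pe_size = 4 * MIB    # PE size of the VG, from the text
moved = 300 * GIB    # total moved, treated as one volume (worst case)

# Number of physical extents that have to be tracked.
extents = moved // pe_size

# Hypothetical bookkeeping costs (assumptions, not LVM2 internals):
bitmap_bytes = extents // 8    # one bit per extent
table_bytes = extents * 64     # a generous 64 bytes per extent

# One extent buffered in RAM plus the larger of the two bookkeeping variants.
working_set = pe_size + table_bytes

print(f"extents to track: {extents}")
print(f"bitmap size:      {bitmap_bytes / 1024:.1f} KiB")
print(f"table size:       {table_bytes / MIB:.2f} MiB")
print(f"working set:      {working_set / MIB:.2f} MiB")
```

Even with the generous table variant this stays under 10 MiB, so the 20MB guess is on the safe side and nowhere near enough to exhaust the box on its own.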
The next thing I’m just disgusted by is this whole “let’s do it in userspace” stuff – pvmove should do its job via some API to the kernel LVM driver; then the OOM killer would have caused no harm.
And, lastly, it is obviously low-quality code, considering it can resume only in theory; in practice I’ll have to pray, hard-reset, boot to runlevel one, retry the pvmove --abort and then pray some more.
I WISH the Sistina people had invested some more time to really understand logical volume management when they “reimplemented” the HP-UX LVM, so there wouldn’t be so many points where LVM2 still breaks apart.
After all, pvmove worked there about 12 years ago without any RAM or CPU pressure, even on 64MB boxes…