linux-2.6

dect

Archived

Author	SHA1	Message	Date
Rusty Russell	005f8eee6f	[S390] cpumask: use mm_cpumask() wrapper Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:34 +01:00
Heiko Carstens	1485c5c884	[S390] move EXPORT_SYMBOLs to definitions Move all EXPORT_SYMBOLs to their corresponding definitions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:11 +01:00
Carsten Otte	702d9e584f	[S390] check addressing mode in s390_enable_sie The sie instruction requires address spaces to be switched to run proper. This patch verifies that this is the case in s390_enable_sie, otherwise the kernel would crash badly as soon as the process runs into sie. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:09 +01:00
Heiko Carstens	59fa4392dd	[S390] page fault: invoke oom-killer s390 arch backend for `1c0fe6e3bd` "mm: invoke oom-killer from page fault". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:03 +01:00
Martin Schwidefsky	0fb1d9bcbc	[S390] make page table upgrade work again After TASK_SIZE now gives the current size of the address space the upgrade of a 64 bit process from 3 to 4 levels of page table needs to use the arch_mmap_check hook to catch large mmap lengths. The get_unmapped_area* functions need to check for -ENOMEM from the arch_get_unmapped_area*, upgrade the page table and retry. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-18 13:28:13 +01:00
Martin Schwidefsky	f481bfafd3	[S390] make page table walking more robust Make page table walking on s390 more robust. The current code requires that the pgd/pud/pmd/pte loop is only done for address ranges that are below the end address of the last vma of the address space. But this is not always true, e.g. the generic page table walker does not guarantee this. Change TASK_SIZE/TASK_SIZE_OF to reflect the current size of the address space. This makes the generic page table walker happy but it breaks the upgrade of a 3 level page table to a 4 level page table. To make the upgrade work again another fix is required. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-18 13:28:13 +01:00
Gary Hade	c04fc586c1	mm: show node to memory section relationship with symlinks in sysfs Show node to memory section relationship with symlinks in sysfs Add /sys/devices/system/node/nodeX/memoryY symlinks for all the memory sections located on nodeX. For example: /sys/devices/system/node/node1/memory135 -> ../../memory/memory135 indicates that memory section 135 resides on node1. Also revises documentation to cover this change as well as updating Documentation/ABI/testing/sysfs-devices-memory to include descriptions of memory hotremove files 'phys_device', 'phys_index', and 'state' that were previously not described there. In addition to it always being a good policy to provide users with the maximum possible amount of physical location information for resources that can be hot-added and/or hot-removed, the following are some (but likely not all) of the user benefits provided by this change. Immediate: - Provides information needed to determine the specific node on which a defective DIMM is located. This will reduce system downtime when the node or defective DIMM is swapped out. - Prevents unintended onlining of a memory section that was previously offlined due to a defective DIMM. This could happen during node hot-add when the user or node hot-add assist script onlines _all_ offlined sections due to user or script inability to identify the specific memory sections located on the hot-added node. The consequences of reintroducing the defective memory could be ugly. - Provides information needed to vary the amount and distribution of memory on specific nodes for testing or debugging purposes. Future: - Will provide information needed to identify the memory sections that need to be offlined prior to physical removal of a specific node. Symlink creation during boot was tested on 2-node x86_64, 2-node ppc64, and 2-node ia64 systems. Symlink creation during physical memory hot-add tested on a 2-node x86_64 system. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
Jens Axboe	abf137dd77	aio: make the lookup_ioctx() lockless The mm->ioctx_list is currently protected by a reader-writer lock, so we always grab that lock on the read side for doing ioctx lookups. As the workload is extremely reader biased, turn this into an rcu hlist so we can make lookup_ioctx() lockless. Get rid of the rwlock and use a spinlock for providing update side exclusion. There's usually only 1 entry on this list, so it doesn't make sense to look into fancier data structures. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Hongjie Yang	93098bf015	[S390] convert dcssblk and extmem printks messages to pr_xxx macros. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-12-25 13:39:23 +01:00
Christian Borntraeger	250cf776f7	[S390] pgtables: Fix race in enable_sie vs. page table ops The current enable_sie code sets the mm->context.pgstes bit to tell dup_mm that the new mm should have extended page tables. This bit is also used by the s390 specific page table primitives to decide about the page table layout - which means context.pgstes has two meanings. This can cause any kind of bugs. For example - e.g. shrink_zone can call ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young will test for context.pgstes. Since enable_sie changed that value of the old struct mm without changing the page table layout ptep_clear_flush_young will do the wrong thing. The solution is to split pgstes into two bits - one for the allocation - one for the current state Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-10-28 11:12:03 +01:00
Badari Pulavarty	71088785c6	mm: cleanup to make remove_memory() arch-neutral There is nothing architecture specific about remove_memory(). remove_memory() function is common for all architectures which support hotplug memory remove. Instead of duplicating it in every architecture, collapse them into arch neutral function. [akpm@linux-foundation.org: fix the export] Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Hongjie Yang	b2300b9efe	[S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support. The DCSS block device driver is modified to add >2G DCSSs support and allow a DCSS block device to map to a set of contiguous DCSSs. The extmem code is also modified to use new Diagnose x'64' subcodes for >2G DCSSs. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-10-10 21:33:57 +02:00
Gerald Schaefer	7e9238fbc1	[S390] Add support for memory hot-remove. This patch enables memory hot-remove on s390. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-08-01 16:39:33 +02:00
Johannes Weiner	c55281dee0	s390: use generic show_mem() Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:11 -07:00
Andi Kleen	ceb8687961	hugetlb: introduce pud_huge Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:18 -07:00
Andi Kleen	a551643895	hugetlb: modular state for hugetlb page size The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Heiko Carstens	421c175c4d	[S390] Add support for memory hot-add. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-07-14 10:02:16 +02:00
Linus Torvalds	a4df1ac12d	Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: MMU: Fix is_empty_shadow_page() check KVM: MMU: Fix printk() format string KVM: IOAPIC: only set remote_irr if interrupt was injected KVM: MMU: reschedule during shadow teardown KVM: VMX: Clear CR4.VMXE in hardware_disable KVM: migrate PIT timer KVM: ppc: Report bad GFNs KVM: ppc: Use a read lock around MMU operations, and release it on error KVM: ppc: Remove unmatched kunmap() call KVM: ppc: add lwzx/stwz emulation KVM: ppc: Remove duplicate function KVM: s390: Fix race condition in kvm_s390_handle_wait KVM: s390: Send program check on access error KVM: s390: fix interrupt delivery KVM: s390: handle machine checks when guest is running KVM: s390: fix locking order problem in enable_sie KVM: s390: use yield instead of schedule to implement diag 0x44 KVM: x86 emulator: fix hypercall return value on AMD KVM: ia64: fix zero extending for mmio ld1/2/4 emulation in KVM	2008-06-11 10:35:44 -07:00
Heiko Carstens	ee0ddadd08	[S390] vmemmap: fix off-by-one bug. If a memory range is supposed to be added to the 1:1 mapping and it ends just below the maximum supported physical address it won't succeed. This is because a test doesn't consider that the end address is 1 smaller than start + size. Fix the comparison. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-06-10 10:03:27 +02:00
Christian Borntraeger	74b6b522ec	KVM: s390: fix locking order problem in enable_sie There are potential locking problem in enable_sie. We take the task_lock and the mmap_sem. As exit_mm uses the same locks vice versa, this triggers a lockdep warning. The second problem is that dup_mm and mmput might sleep, so we must not hold the task_lock at that moment. The solution is to dup the mm unconditional and use the task_lock before and afterwards to check if we can use the new mm. dup_mm and mmput are called outside the task_lock, but we run update_mm while holding the task_lock, protection us against ptrace. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:08:26 +03:00
Heiko Carstens	c1bb7f31ea	[S390] showmem: Only walk spanned pages. Convert show_mem() so its nearly the same as on x86/powerpc. Gives us proper locking and we get also rid of the only use of max_mapnr. Also the number of pages was contained in an int which might not be sufficient not too far in the future. Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-30 10:03:34 +02:00
Heiko Carstens	67060d9c1f	[S390] Fix section mismatch warnings. This fixes the last remaining section mismatch warnings in s390 architecture code. It reveals also a real bug introduced by... me with git commit `2069e978d5` ("[S390] sparsemem vmemmap: initialize memmap.") Calling the generic vmemmap_alloc_block() function to get initialized memory is a nice idea, however that function is __meminit annotated and therefore the function might be gone if we try to call it later. This can happen if a DCSS segment gets added. So basically revert the patch and clear the memmap explicitly to fix the original bug. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-30 10:03:34 +02:00
Heiko Carstens	2069e978d5	[S390] sparsemem vmemmap: initialize memmap. Let's just use the generic vmmemmap_alloc_block() function which always returns initialized memory. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-15 16:52:38 +02:00
Martin Schwidefsky	45e576b1c3	[S390] guest page hinting light Use the existing arch_alloc_page/arch_free_page callbacks to do the guest page state transitions between stable and unused. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-07 09:23:02 +02:00
Heiko Carstens	17f3458085	[S390] Convert to SPARSEMEM & SPARSEMEM_VMEMMAP Convert s390 to SPARSEMEM and SPARSEMEM_VMEMMAP. We do a select of SPARSEMEM_VMEMMAP since it is configurable. This is because SPARSEMEM without SPARSEMEM_VMEMMAP gives us a hell of broken include dependencies that I don't want to fix. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:48 +02:00
Gerald Schaefer	53492b1de4	[S390] System z large page support. This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:47 +02:00
Heiko Carstens	8fc6365868	[S390] vmemmap: use clear_table to initialise page tables. Always use clear_table to initialise page tables. The overlapping memcpy is just a leftover of a previous version that wasn't fully converted to clear_table. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:46 +02:00
Carsten Otte	402b08622d	s390: KVM preparation: provide hook to enable pgstes in user pagetable The SIE instruction on s390 uses the 2nd half of the page table page to virtualize the storage keys of a guest. This patch offers the s390_enable_sie function, which reorganizes the page tables of a single-threaded process to reserve space in the page table: s390_enable_sie makes sure that the process is single threaded and then uses dup_mm to create a new mm with reorganized page tables. The old mm is freed and the process has now a page status extended field after every page table. Code that wants to exploit pgstes should SELECT CONFIG_PGSTE. This patch has a small common code hit, namely making dup_mm non-static. Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's review feedback. Now we do have the prototype for dup_mm in include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now call task_lock() to prevent race against ptrace modification of mm_users. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:40 +03:00
Martin Schwidefsky	ca68305bf3	[S390] Remove code duplication from monreader / dcssblk. Move the function that prints the segment warning messages found in the monreader driver and the dcssblk driver to the extmem base code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:07 +02:00
Heiko Carstens	a806170e29	[S390] Fix a lot of sparse warnings. Most noteable part of this commit is the new local header file entry.h which contains all the function declarations of functions that get only called from asm code or are arch internal. That way we can avoid extern declarations in C files. This is more or less the same that was done for sparc64. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:06 +02:00
Johannes Weiner	1e42f32785	[S390] remove redundant display of free swap space in show_mem() Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:04 +02:00
Heiko Carstens	c2e8b8531b	[S390] exec_protect: Fix incorrect extern declarations. sys_sigreturn and sys_rt_sigreturn don't take any arguments. So luckily this resulted only in unneeded instead of incorrect code. But still this clearly shows why one should not put extern declarations in C files (will be fixed with a larger sparse patch). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:46:59 +02:00
Martin Schwidefsky	6252d702c5	[S390] dynamic page tables. Add support for different number of page table levels dependent on the highest address used for a process. This will cause a 31 bit process to use a two level page table instead of the four level page table that is the default after the pud has been introduced. Likewise a normal 64 bit process will use three levels instead of four. Only if a process runs out of the 4 tera bytes which can be addressed with a three level page table the fourth level is dynamically added. Then the process can use up to 8 peta byte. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:41 +01:00
Martin Schwidefsky	5a216a2083	[S390] Add four level page tables for CONFIG_64BIT=y. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:40 +01:00
Martin Schwidefsky	146e4b3c8b	[S390] 1K/2K page table pages. This patch implements 1K/2K page table pages for s390. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:40 +01:00
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
Heiko Carstens	0189103c69	[S390] Remove BUILD_BUG_ON() in vmem code. Remove BUILD_BUG_ON() in vmem code since it causes build failures if the size of struct page increases. Instead calculate at compile time the address of the highest physical address that can be added to the 1:1 mapping. This supposed to fix a build failure with the page owner tracking leak detector patches as reported by akpm. page-owner-tracking-leak-detector-broken-on-s390.patch can be removed from -mm again when this is merged. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:51:01 +01:00
Heiko Carstens	2bc89b5ece	[S390] Fix couple of section mismatches. Fix couple of section mismatches. And since we touch the code anyway change the IPL code to use C99 initializers. Cc: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:56 +01:00
Heiko Carstens	2485579bf5	[S390] DEBUG_PAGEALLOC support for s390. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:54 +01:00
Christian Borntraeger	a2fd64d6aa	[S390] vmemmap: allocate struct pages before 1:1 mapping We have seen an oops in an OOM situation, where show_mem tried to access the struct page of a dcss segment. The vmemmap code has already created the 1:1 mapping but failed allocating the struct pages. In the OOM case, show_mem now walks the memory. It uses pfn_valid to detect if it may access the struct page. In the case described above, the mapping was established and pfn_valid returned true. As the struct pages were not allocated, the kernel oopsed. We have to ensure that we have created the struct pages, before we add a mapping pointing to the pages. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:23 +01:00
Denis Cheng	c11ca97ee9	[S390] use LIST_HEAD instead of LIST_HEAD_INIT single list_head variable initialized with LIST_HEAD_INIT could almost always can be replaced with LIST_HEAD declaration, this shrinks the code and looks better. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:21 +01:00
Heiko Carstens	9f4b0ba81f	[S390] Get rid of HOLES_IN_ZONE requirement. Align everything to MAX_ORDER so we can get rid of the extra checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:13 +01:00
Christian Borntraeger	5fd9c6e214	[S390] Change vmalloc defintions Currently the vmalloc area starts at a dynamic address depending on the memory size. There was also an 8MB security hole after the physical memory to catch out-of-bounds accesses. We can simplify the code by putting the vmalloc area explicitely at the top of the kernel mapping and setting the vmalloc size to a fixed value of 128MB/128GB for 31bit/64bit systems. Part of the vmalloc area will be used for the vmem_map. This leaves an area of 96MB/1GB for normal vmalloc allocations. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:12 +01:00
Heiko Carstens	43ebbf119a	[S390] cmm: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Martin Schwidefsky	190a1d722a	[S390] 4level-fixup cleanup Get independent from asm-generic/4level-fixup.h Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Martin Schwidefsky	3610cce87a	[S390] Cleanup page table definitions. - De-confuse the defines for the address-space-control-elements and the segment/region table entries. - Create out of line functions for page table allocation / freeing. - Simplify get_shadow_xxx functions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Serge E. Hallyn	b460cbc581	pid namespaces: define is_global_init() and is_container_init() is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
David Rientjes	5a3135c2e7	oom: move prototypes to appropriate header file Move the OOM killer's extern function prototypes to include/linux/oom.h and include it where necessary. [clg@fr.ibm.com: build fix] Cc: Andrea Arcangeli <andrea@suse.de> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Will Schmidt	dcca2bde4f	During VM oom condition, kill all threads in process group We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:52 -07:00
Heiko Carstens	c41fbc6965	[S390] pfault: Fix alignment of parameter list. Make sure parameter list of the pfault token function is eight byte aligned. Otherwise we can get specification exceptions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-12 16:13:11 +02:00

1 2 3

114 Commits