linux-2.6

dect

Archived

Author	SHA1	Message	Date
Xiao Guangrong	fce92dce79	KVM: MMU: filter out the mmio pfn from the fault pfn If the page fault is caused by mmio, the gfn can not be found in memslots, and 'bad_pfn' is returned on gfn_to_hva path, so we can use 'bad_pfn' to identify the mmio page fault. And, to clarify the meaning of mmio pfn, we return fault page instead of bad page when the gfn is not allowd to prefetch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-24 11:50:34 +03:00
Glauber Costa	c9aaa8957f	KVM: Steal time implementation To implement steal time, we need the hypervisor to pass the guest information about how much time was spent running other processes outside the VM, while the vcpu had meaningful work to do - halt time does not count. This information is acquired through the run_delay field of delayacct/schedstats infrastructure, that counts time spent in a runqueue but not running. Steal time is a per-cpu information, so the traditional MSR-based infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the memory area address containing information about steal time This patch contains the hypervisor part of the steal time infrasructure, and can be backported independently of the guest portion. [avi, yongjie: export delayacct_on, to avoid build failures in some configs] Signed-off-by: Glauber Costa <glommer@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Rik van Riel <riel@redhat.com> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Yongjie Ren <yongjie.ren@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:14 +03:00
Gleb Natapov	e03b644fe6	KVM: introduce kvm_read_guest_cached Introduce kvm_read_guest_cached() function in addition to write one we already have. [ by glauber: export function signature in kvm header ] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric Munson <emunson@mgebm.net> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:17:01 +03:00
Linus Torvalds	5e152b4c9e	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits) PCI: Don't use dmi_name_in_vendors in quirk PCI: remove unused AER functions PCI/sysfs: move bus cpuaffinity to class dev_attrs PCI: add rescan to /sys/.../pci_bus/.../ PCI: update bridge resources to get more big ranges when allocating space (again) KVM: Use pci_store/load_saved_state() around VM device usage PCI: Add interfaces to store and load the device saved state PCI: Track the size of each saved capability data area PCI/e1000e: Add and use pci_disable_link_state_locked() x86/PCI: derive pcibios_last_bus from ACPI MCFG PCI: add latency tolerance reporting enable/disable support PCI: add OBFF enable/disable support PCI: add ID-based ordering enable/disable support PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot. PCI: Set PCIE maxpayload for card during hotplug insertion PCI/ACPI: Report _OSC control mask returned on failure to get control x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs PCI: handle positive error codes PCI: check pci_vpd_pci22_wait() return PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio ... Fix up trivial conflicts in include/linux/pci_ids.h: commit `a6e5e2be44` moved the intel SMBUS ID definitons to the i2c-i801.c driver.	2011-05-23 15:39:34 -07:00
Gleb Natapov	8fa2206821	KVM: make guest mode entry to be rcu quiescent state KVM does not hold any references to rcu protected data when it switches CPU into a guest mode. In fact switching to a guest mode is very similar to exiting to userspase from rcu point of view. In addition CPU may stay in a guest mode for quite a long time (up to one time slice). Lets treat guest mode as quiescent state, just like we do with user-mode execution. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-05-22 08:48:09 -04:00
Alex Williamson	f8fcfd7755	KVM: Use pci_store/load_saved_state() around VM device usage Store the device saved state so that we can reload the device back to the original state when it's unassigned. This has the benefit that the state survives across pci_reset_function() calls via the PCI sysfs reset interface while the VM is using the device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-05-21 12:17:10 -07:00
Jeff Mahoney	b42fc3cbc3	KVM: Fix off by one in kvm_for_each_vcpu iteration This patch avoids gcc issuing the following warning when KVM_MAX_VCPUS=1: warning: array subscript is above array bounds kvm_for_each_vcpu currently checks to see if the index for the vcpu is valid /after/ loading it. We don't run into problems because the address is still inside the enclosing struct kvm and we never deference or write to it, so this isn't a security issue. The warning occurs when KVM_MAX_VCPUS=1 because the increment portion of the loop will always cause the loop to load an invalid location since ++idx will always be > 0. This patch moves the load so that the check occurs before the load and we don't run into the compiler warning. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-05-11 07:57:09 -04:00
Avi Kivity	cef4dea07f	KVM: 16-byte mmio support Since sse instructions can issue 16-byte mmios, we need to support them. We can't increase the kvm_run mmio buffer size to 16 bytes without breaking compatibility, so instead we break the large mmios into two smaller 8-byte ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-05-11 07:56:58 -04:00
Marcelo Tosatti	c761e5868e	Revert "KVM: Fix race between nmi injection and enabling nmi window" This reverts commit `f86368493e`. Simpler fix to follow. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-05-11 07:56:57 -04:00
Xiao Guangrong	0ee8dcb87e	KVM: cleanup memslot_id function We can get memslot id from memslot->id directly Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-05-11 07:56:53 -04:00
Avi Kivity	f86368493e	KVM: Fix race between nmi injection and enabling nmi window The interrupt injection logic looks something like if an nmi is pending, and nmi injection allowed inject nmi if an nmi is pending request exit on nmi window the problem is that "nmi is pending" can be set asynchronously by the PIT; if it happens to fire between the two if statements, we will request an nmi window even though nmi injection is allowed. On SVM, this has disasterous results, since it causes eflags.TF to be set in random guest code. The fix is simple; make nmi_pending synchronous using the standard vcpu->requests mechanism; this ensures the code above is completely synchronous wrt nmi_pending. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Rik van Riel	217ece6129	KVM: use yield_to instead of sleep in kvm_vcpu_on_spin Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:29 -03:00
Rik van Riel	34bb10b79d	KVM: keep track of which task is running a KVM vcpu Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately there are no guarantees that the same task always keeps the same vcpu, so we can only track the task across a single "run" of the vcpu. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:29 -03:00
Xiao Guangrong	3cba41307a	KVM: make make_all_cpus_request() lockless Now, we have 'vcpu->mode' to judge whether need to send ipi to other cpus, this way is very exact, so checking request bit is needless, then we can drop the spinlock let it's collateral Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:26 -03:00
Xiao Guangrong	6b7e2d0991	KVM: Add "exiting guest mode" state Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely. Also 1: No need atomically to read/write ->mode in vcpu's thread 2: reorganize struct kvm_vcpu to make ->mode and ->requests in the same cache line explicitly Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:26 -03:00
Avi Kivity	5c663a1534	KVM: Fix build error on s390 due to missing tlbs_dirty Make it available for all archs. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:50 +02:00
Takuya Yoshikawa	d4dbf47009	KVM: MMU: Make the way of accessing lpage_info more generic Large page information has two elements but one of them, write_count, alone is accessed by a helper function. This patch replaces this helper function with more generic one which returns newly named kvm_lpage_info structure and use it to access the other element rmap_pde. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:47 +02:00
Xiao Guangrong	a4ee1ca4a3	KVM: MMU: delay flush all tlbs on sync_page path Quote from Avi: \| I don't think we need to flush immediately; set a "tlb dirty" bit somewhere \| that is cleareded when we flush the tlb. kvm_mmu_notifier_invalidate_page() \| can consult the bit and force a flush if set. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:51 +02:00
Alexander Graf	27923eb19c	KVM: PPC: Fix compile warning KVM compilation fails with the following warning: include/linux/kvm_host.h: In function 'kvm_irq_routing_update': include/linux/kvm_host.h:679:2: error: 'struct kvm' has no member named 'irq_routing' That function is only used and reasonable to have on systems that implement an in-kernel interrupt chip. PPC doesn't. Fix by #ifdef'ing it out when no irqchip is available. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:42 +02:00
Michael S. Tsirkin	bd2b53b20f	KVM: fast-path msi injection with irqfd Store irq routing table pointer in the irqfd object, and use that to inject MSI directly without bouncing out to a kernel thread. While we touch this structure, rearrange irqfd fields to make fastpath better packed for better cache utilization. This also adds some comments about locking rules and rcu usage in code. Some notes on the design: - Use pointer into the rt instead of copying an entry, to make it possible to use rcu, thus side-stepping locking complexities. We also save some memory this way. - Old workqueue code is still used for level irqs. I don't think we DTRT with level anyway, however, it seems easier to keep the code around as it has been thought through and debugged, and fix level later than rip out and re-instate it later. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:38 +02:00
Jan Kiszka	1e001d49f9	KVM: Refactor IRQ names of assigned devices Cosmetic change, but it helps to correlate IRQs with PCI devices. Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:21 +02:00
Jan Kiszka	0645211c43	KVM: Switch assigned device IRQ forwarding to threaded handler This improves the IRQ forwarding for assigned devices: By using the kernel's threaded IRQ scheme, we can get rid of the latency-prone work queue and simplify the code in the same run. Moreover, we no longer have to hold assigned_dev_lock while raising the guest IRQ, which can be a lenghty operation as we may have to iterate over all VCPUs. The lock is now only used for synchronizing masking vs. unmasking of INTx-type IRQs, thus is renames to intx_lock. Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:20 +02:00
Jan Kiszka	d89f5eff70	KVM: Clean up vm creation and release IA64 support forces us to abstract the allocation of the kvm structure. But instead of mixing this up with arch-specific initialization and doing the same on destruction, split both steps. This allows to move generic destruction calls into generic code. It also fixes error clean-up on failures of kvm_create_vm for IA64. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:09 +02:00
Takuya Yoshikawa	515a01279a	KVM: pre-allocate one more dirty bitmap to avoid vmalloc() Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by vmalloc() which will be used in the next logging and this has been causing bad effect to VGA and live-migration: vmalloc() consumes extra systime, triggers tlb flush, etc. This patch resolves this issue by pre-allocating one more bitmap and switching between two bitmaps during dirty logging. Performance improvement: I measured performance for the case of VGA update by trace-cmd. The result was 1.5 times faster than the original one. In the case of live migration, the improvement ratio depends on the workload and the guest memory size. In general, the larger the memory size is the more benefits we get. Note: This does not change other architectures's logic but the allocation size becomes twice. This will increase the actual memory consumption only when the new size changes the number of pages allocated by vmalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:28:46 +02:00
Marcelo Tosatti	612819c3c6	KVM: propagate fault r/w information to gup(), allow read-only memory As suggested by Andrea, pass r/w error code to gup(), upgrading read fault to writable if host pte allows it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:28:40 +02:00
Gleb Natapov	344d9588a9	KVM: Add PV MSR to enable asynchronous page faults delivery. Guest enables async PF vcpu functionality using this MSR. Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:12 +02:00
Gleb Natapov	49c7754ce5	KVM: Add memory slot versioning and use it to provide fast guest write interface Keep track of memslots changes by keeping generation number in memslots structure. Provide kvm_write_guest_cached() function that skips gfn_to_hva() translation if memslots was not changed since previous invocation. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:08 +02:00
Gleb Natapov	af585b921e	KVM: Halt vcpu if page it tries to access is swapped out If a guest accesses swapped out memory do not swap it in from vcpu thread context. Schedule work to do swapping and put vcpu into halted state instead. Interrupts will still be delivered to the guest and if interrupt will cause reschedule guest will continue to run another task. [avi: remove call to get_user_pages_noio(), nacked by Linus; this makes everything synchrnous again] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:21:39 +02:00
Linus Torvalds	1765a1fe5d	Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits) KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages KVM: Fix signature of kvm_iommu_map_pages stub KVM: MCE: Send SRAR SIGBUS directly KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED KVM: fix typo in copyright notice KVM: Disable interrupts around get_kernel_ns() KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address KVM: MMU: move access code parsing to FNAME(walk_addr) function KVM: MMU: audit: check whether have unsync sps after root sync KVM: MMU: audit: introduce audit_printk to cleanup audit code KVM: MMU: audit: unregister audit tracepoints before module unloaded KVM: MMU: audit: fix vcpu's spte walking KVM: MMU: set access bit for direct mapping KVM: MMU: cleanup for error mask set while walk guest page table KVM: MMU: update 'root_hpa' out of loop in PAE shadow path KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn() KVM: x86: Fix constant type in kvm_get_time_scale KVM: VMX: Add AX to list of registers clobbered by guest switch KVM guest: Move a printk that's using the clock before it's ready KVM: x86: TSC catchup mode ...	2010-10-24 12:47:25 -07:00
Jan Kiszka	d7a79b6c80	KVM: Fix signature of kvm_iommu_map_pages stub Breaks otherwise if CONFIG_IOMMU_API is not set. KVM-Stable-Tag. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:53:15 +02:00
Zachary Amsden	34c238a1d1	KVM: x86: Rename timer function This just changes some names to better reflect the usage they will be given. Separated out to keep confusion to a minimum. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:53:05 +02:00
Avi Kivity	3842d135ff	KVM: Check for pending events before attempting injection Instead of blindly attempting to inject an event before each guest entry, check for a possible event first in vcpu->requests. Sites that can trigger event injection are modified to set KVM_REQ_EVENT: - interrupt, nmi window opening - ppr updates - i8259 output changes - local apic irr changes - rflags updates - gif flag set - event set on exit This improves non-injecting entry performance, and sets the stage for non-atomic injection. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:50 +02:00
Joerg Roedel	c30a358d33	KVM: MMU: Add infrastructure for two-level page walker This patch introduces a mmu-callback to translate gpa addresses in the walk_addr code. This is later used to translate l2_gpa addresses into l1_gpa addresses. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:52:34 +02:00
Xiao Guangrong	365fb3fdf6	KVM: MMU: rewrite audit_mappings_page() function There is a bugs in this function, we call gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() in atomic context(kvm_mmu_audit() is called under the spinlock(mmu_lock)'s protection). This patch fix it by: - introduce gfn_to_pfn_atomic instead of gfn_to_pfn - get the mapping gfn from kvm_mmu_page_get_gfn() And it adds 'notrap' ptes check in unsync/direct sps Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:48 +02:00
Xiao Guangrong	48987781eb	KVM: MMU: introduce gfn_to_page_many_atomic() function Introduce this function to get consecutive gfn's pages, it can reduce gup's overload, used by later patch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:26 +02:00
Xiao Guangrong	887c08ac19	KVM: MMU: introduce hva_to_pfn_atomic function Introduce hva_to_pfn_atomic(), it's the fast path and can used in atomic context, the later patch will use it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:51:26 +02:00
Arnd Bergmann	4b6a2872a2	kvm: add __rcu annotations Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2010-08-19 17:18:01 -07:00
Gleb Natapov	4a994358b9	KVM: Convert mask notifiers to use irqchip/pin instead of gsi Devices register mask notifier using gsi, but irqchip knows about irqchip/pin, so conversion from irqchip/pin to gsi should be done before looking for mask notifier to call. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-02 06:40:39 +03:00
Gleb Natapov	edba23e515	KVM: Return EFAULT from kvm ioctl when guest accesses bad area Currently if guest access address that belongs to memory slot but is not backed up by page or page is read only KVM treats it like MMIO access. Remove that capability. It was never part of the interface and should not be relied upon. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-02 06:40:33 +03:00
Avi Kivity	e36d96f7cf	KVM: Keep slot ID in memory slot structure May be used for distinguishing between internal and user slots, or for sorting slots in size order. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:47:07 +03:00
Avi Kivity	0719837c08	KVM: Reduce atomic operations on vcpu->requests Usually the vcpu->requests bitmap is sparse, so a test_and_clear_bit() for each request generates a large number of unneeded atomics if a bit is set. Replace with a separate test/clear sequence. This is safe since there is no clear_bit() outside the vcpu thread. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:47:06 +03:00
Avi Kivity	a8eeb04a44	KVM: Add mini-API for vcpu->requests Makes it a little more readable and hackable. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:47:05 +03:00
Avi Kivity	a1f4d39500	KVM: Remove memory alias support As advertised in feature-removal-schedule.txt. Equivalent support is provided by overlapping memory regions. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:47:00 +03:00
Dexuan Cui	2acf923e38	KVM: VMX: Enable XSAVE/XRSTOR for guest This patch enable guest to use XSAVE/XRSTOR instructions. We assume that host_xcr0 would use all possible bits that OS supported. And we loaded xcr0 in the same way we handled fpu - do it as late as we can. Signed-off-by: Dexuan Cui <dexuan.cui@intel.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:46:31 +03:00
Avi Kivity	d94e1dc9af	KVM: Get rid of KVM_REQ_KICK KVM_REQ_KICK poisons vcpu->requests by having a bit set during normal operation. This causes the fast path check for a clear vcpu->requests to fail all the time, triggering tons of atomic operations. Fix by replacing KVM_REQ_KICK with a vcpu->guest_mode atomic. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:37 +03:00
Huang Ying	bf998156d2	KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages In common cases, guest SRAO MCE will cause corresponding poisoned page be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay the MCE to guest OS. But it is reported that if the poisoned page is accessed in guest after unmapping and before MCE is relayed to guest OS, userspace will be killed. The reason is as follows. Because poisoned page has been un-mapped, guest access will cause guest exit and kvm_mmu_page_fault will be called. kvm_mmu_page_fault can not get the poisoned page for fault address, so kernel and user space MMIO processing is tried in turn. In user MMIO processing, poisoned page is accessed again, then userspace is killed by force_sig_info. To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM and do not try kernel and user space MMIO processing for poisoned page. [xiao: fix warning introduced by avi] Reported-by: Max Asbock <masbock@linux.vnet.ibm.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:26 +03:00
Avi Kivity	0ee75bead8	KVM: Let vcpu structure alignment be determined at runtime vmx and svm vcpus have different contents and therefore may have different alignmment requirements. Let each specify its required alignment. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:29 +03:00
Gui Jianfeng	2a059bf444	KVM: Get rid of dead function gva_to_page() Nobody use gva_to_page() anymore, get rid of it. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:10 +03:00
Lai Jiangshan	90d83dc3d4	KVM: use the correct RCU API for PROVE_RCU=y The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:01 +03:00
Takuya Yoshikawa	660c22c425	KVM: limit the number of pages per memory slot This patch limits the number of pages per memory slot to make us free from extra care about type issues. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:41 +03:00

1 2 3 4

156 Commits