{"id":8618,"date":"2024-06-19T06:14:39","date_gmt":"2024-06-19T06:14:39","guid":{"rendered":"https:\/\/www.infinitivehost.com\/knowledge-base\/?p=8618"},"modified":"2024-07-30T07:10:21","modified_gmt":"2024-07-30T07:10:21","slug":"fix-amd-driver-issues-after-vm-reboot-with-gpu-passthrough","status":"publish","type":"post","link":"https:\/\/www.infinitivehost.com\/knowledge-base\/fix-amd-driver-issues-after-vm-reboot-with-gpu-passthrough\/","title":{"rendered":"Fix AMD Driver Issues After VM Reboot with GPU Passthrough"},"content":{"rendered":"<p>When AMD GPU drivers fail to load after rebooting a VM configured with GPU passthrough in Virt-Manager, several factors could be at play. Here\u2019s a step-by-step guide to diagnose and potentially resolve the issue:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step-by-Step Troubleshooting Guide<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Verify Host Configuration for IOMMU and VFIO<\/strong><\/li>\n<\/ol>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Enable IOMMU in BIOS<\/strong>:\n<ul class=\"wp-block-list\">\n<li>For Intel systems, look for <code>VT-d<\/code> in BIOS settings.<\/li>\n\n\n\n<li>For AMD systems, look for <code>AMD-Vi<\/code> (often labeled <code>IOMMU<\/code>). <code>SVM<\/code> is the separate CPU virtualization setting and should be enabled as well.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Enable IOMMU in the Kernel<\/strong>:<br>Edit <code>\/etc\/default\/grub<\/code> and add the following to <code>GRUB_CMDLINE_LINUX_DEFAULT<\/code>: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">intel_iommu=on iommu=pt<\/mark><\/code> for Intel, or <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">amd_iommu=on iommu=pt<\/mark><\/code> for AMD. Update GRUB and reboot: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">sudo update-grub<br>sudo reboot<\/mark><\/code><\/li>\n\n\n\n<li><strong>Bind 
GPU to VFIO-PCI<\/strong>:<br>Identify the vendor:device IDs of your GPU and its audio function (shown in square brackets) with: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">lspci -nn | grep -i vga<br>lspci -nn | grep -i audio<\/mark><\/code> Create or edit <code>\/etc\/modprobe.d\/vfio.conf<\/code>: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">options vfio-pci ids=xxxx:yyyy,aaaa:bbbb<\/mark><\/code> Replace <code>xxxx:yyyy<\/code> and <code>aaaa:bbbb<\/code> with the respective IDs.<\/li>\n\n\n\n<li><strong>Blacklist Host GPU Drivers<\/strong>:<br>Prevent the host from loading its own drivers for the GPU. Add the following to <code>\/etc\/modprobe.d\/blacklist.conf<\/code>:<br><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">blacklist amdgpu<br>blacklist radeon<\/mark><\/code><br>Rebuild the initramfs and reboot:<br><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">sudo update-initramfs -u<br>sudo reboot<\/mark><\/code><\/li>\n<\/ul>\n\n\n\n<p>2. <strong>Verify the GPU Device Isolation<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ensure the GPU is correctly isolated and not bound to the host&#8217;s drivers. Check the output of <code>lspci -k<\/code> and confirm the GPU reports <code>Kernel driver in use: vfio-pci<\/code>.<\/li>\n<\/ul>\n\n\n\n<p>3. 
<strong>Check VM Configuration in Virt-Manager<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>PCI Host Device<\/strong>:<br>Verify that the GPU is added as a &#8220;PCI Host Device&#8221; in the VM configuration under &#8220;Add Hardware.&#8221;<\/li>\n\n\n\n<li><strong>Firmware<\/strong>:<br>Ensure the VM is using the <code>Q35<\/code> chipset and <code>OVMF<\/code> (UEFI) firmware, which are often required for modern GPU passthrough.<\/li>\n<\/ul>\n\n\n\n<p>4. <strong>Monitor and Analyze VM Logs<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Check VM logs for errors related to GPU passthrough: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">cat \/var\/log\/libvirt\/qemu\/&lt;vm-name&gt;.log<\/mark><\/code><\/li>\n\n\n\n<li>Review host system logs using:<br><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">sudo dmesg | grep -i iommu<br>sudo journalctl -xe | grep -i vfio<\/mark><\/code><\/li>\n<\/ul>\n\n\n\n<p>5. <strong>Ensure Proper GPU Reset Handling<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reset Scripts<\/strong>:<br>Some GPUs require a reset to be properly reinitialized after VM shutdown. 
A reset script, run as root, might look like: <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">echo 1 &gt; \/sys\/bus\/pci\/devices\/0000:xx:00.0\/remove<br>echo 1 &gt; \/sys\/bus\/pci\/rescan<\/mark><\/code> Replace <code>xx:00.0<\/code> with your GPU\u2019s PCI address.<\/li>\n\n\n\n<li><strong>Libvirt Hooks<\/strong>:<br>Automate the GPU reset by placing the script in <code>\/etc\/libvirt\/hooks\/qemu<\/code>, for example:<br><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">#!\/bin\/bash<br>if [ \"$1\" = \"your-vm-name\" ] &amp;&amp; [ \"$2\" = \"stopped\" ]; then<br>echo 1 &gt; \/sys\/bus\/pci\/devices\/0000:xx:00.0\/remove<br>echo 1 &gt; \/sys\/bus\/pci\/rescan<br>fi<\/mark><\/code><br>Make sure to give it execute permissions:<br><code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">sudo chmod +x \/etc\/libvirt\/hooks\/qemu<\/mark><\/code><\/li>\n<\/ul>\n\n\n\n<p>6. <strong>Update AMD Drivers in the VM<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Guest OS<\/strong>:<br>Inside the VM, update to the latest AMD drivers available for your OS. This can usually be done through the package manager or by downloading them from AMD\u2019s website.<\/li>\n\n\n\n<li><strong>Ensure Compatibility<\/strong>:<br>Make sure the guest OS and its drivers are compatible with your GPU and the virtualization setup.<\/li>\n<\/ul>\n\n\n\n<p>7. <strong>Test with a Different Kernel or GPU<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Kernel<\/strong>:<br>Some kernel versions may have better support for GPU passthrough. 
Test with a newer or different kernel version.<\/li>\n\n\n\n<li><strong>GPU<\/strong>:<br>If possible, try using a different GPU to determine if the issue is specific to your current GPU model.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Commonly Used Commands and Logs<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Checking IOMMU Groups<\/strong>:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>  <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">find \/sys\/kernel\/iommu_groups\/ -type l<\/mark><\/code><\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Listing PCI Devices and Their Drivers<\/strong>:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>  <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">lspci -k<\/mark><\/code><\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>System Logs<\/strong>:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>  <code><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-red-color\">sudo dmesg | grep -i vfio\n  sudo journalctl -xe | grep -i iommu<\/mark><\/code>\n<\/code><\/pre>\n\n\n\n<p><strong>Conclusion<\/strong><\/p>\n\n\n\n<p>AMD GPU drivers that fail to load after a VM reboot with GPU passthrough, including on the <a href=\"https:\/\/www.infinitivehost.com\/gpu-dedicated-server\"><strong><mark style=\"background-color:#8ed1fc\" class=\"has-inline-color\">best GPU dedicated server<\/mark><\/strong><\/a>, can be challenging to fix because of driver and hardware reset issues. Verifying the IOMMU and VFIO configuration, checking the VM configuration, handling GPU resets, and analyzing the VM and host logs should let you resolve most cases of the AMD driver not loading after a VM reboot. 
If you run into this issue, work through the steps above to find and clear the errors.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When dealing with issues related to AMD GPU drivers not loading after rebooting a VM configured with GPU passthrough on Virt-Manager, several factors could be at play. Here\u2019s a step-by-step guide to diagnose and potentially resolve this issue: Step-by-Step Troubleshooting Guide 2. Verify the GPU Device Isolation 3. Check VM Configuration in Virt-Manager [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[202],"tags":[],"class_list":["post-8618","post","type-post","status-publish","format-standard","hentry","category-gpu-server"],"_links":{"self":[{"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/posts\/8618","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/comments?post=8618"}],"version-history":[{"count":2,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/posts\/8618\/revisions"}],"predecessor-version":[{"id":8790,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/posts\/8618\/revisions\/8790"}],"wp:attachment":[{"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/media?parent=8618"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/categories?post=8618"},{"taxonomy":"po
st_tag","embeddable":true,"href":"https:\/\/www.infinitivehost.com\/knowledge-base\/wp-json\/wp\/v2\/tags?post=8618"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}