This article concerns Thunderbolt 3 add-in cards. The situation is no better for Thunderbolt 4 however some technical details differ from what is written on this page.
I’ve long been intrigued by Thunderbolt add-in cards. Apparently regular looking PCIe expansion cards, but shipped with a mystery interface cable to the motherboard, of which there is a small list of supported models.
It’s not a secret that these cards may work in a motherboard which isn’t supported, but full functionality is not a given. I have spent the past few evenings trawling through many forums, reading about the many different experiences people are having, and have also purchased some hardware to play around with myself, so we can dig into these problems and see what (if any) solutions there are.
Common Problem #1: It doesn’t work at all
Card not detected, nor any attached devices, nada.
This is a nice easy one to fix. One of the pins on the mystery Thunderbolt header interface on the rear of the card is an “enable” (technically ‘force power’) signal. On the Gigabyte GC-TITAN RIDGE (one of the most popular cards for Thunderbolt hacking) it is regularly reported that you can “short pins 3 and 5”. Note that you are actually shorting pins 1 and 3. Pin 1 is clearly marked on the connector as being at the top of the card, and in the drawing of the connector.
This modification works thanks to a weak pull-up on the input at pin 3 meaning there is +3.3V present. It will not necessarily work for every AIC. In some cases a separate +3.3V supply will need to be applied (via a 10K resistor for example) to the “force power” pin, which on the GC-TITAN RIDGE is pin 1.
This will not work for all AICs. I have noticed that some (Alpine Ridge for example) cards come up with a PCI Vendor/Device ID of 0xFFFF/0xFFFF which would mean that they wouldn’t be enumerated by the system. These cards will need some more initialisation that I’ll be looking at in future.
Common Problem #2: When peripherals are connected at boot time, they’re not detected
But if I subsequently unplug it, and plug it back in, then it works?
This cause of this problem is that all Thunderbolt Add-in cards by default power up at security level “SL1” which means users have to manually approve attached peripherals. Because you likely don’t have any Thunderbolt options in your UEFI setup menu, you can’t change this either. You will have previously installed the Intel Thunderbolt software, seen your device in the list, and approved it, and it might have even sprung into life.
When you approve the device, the Intel Thunderbolt software enables the PCIe endpoint for the peripheral in question which brings the peripheral online. The problem is that it only does this on first approval and hot-plug events. We can prove this by opening up the Thunderbolt software with a non functioning device that was attached at boot time, un-approve it, and re-approve it. It now works.
Because your motherboard likely has no Thunderbolt support whatsoever in the UEFI, the boot time activation process is missing. It may be possible to somehow hack the Thunderbolt software to force it to activate approved devices at boot time, but this isn’t an ideal solution.
If you aren’t concerned about security, there is a good fix out there for some Thunderbolt 3 add-in cards. Björn Ruytenberg is well known for Thunderbolt security research and one of the things he’s produced (albeit for a topic completely unrelated to this post) is a Python script which can patch the Thunderbolt controller firmware to bring the card up in “SL0” mode (No security). The purpose of his hack is to create an “evil maid” Thunderbolt controller for performing DMA attacks. Us? we just want the darn thing to work.
I have personally confirmed this as a solution on the Gigabyte GC-TITAN RIDGE. By having SL0 as the default, all periperhals are automatically approved at the time of connection or system power-on. We wouldn’t need the Intel Thunderbolt software anymore but it doesn’t hurt to have it installed.
It does unfortunately require attaching an external programmer to the SPI chip on the card. I won’t be covering this subject as there are many guides out there already where Thunderbolt firmware is re-flashed for other reasons. One small note is that I was unable to patch the firmware already on the card, however the BIN file of the firmware on Gigabyte’s website patched and flashed no problem.
Common Problem #3: I can’t hot plug
Or perhaps you can hot plug on port 2, but not port 1. This is assuming you are not having problem #2.
This is because the Thunderbolt controller hasn’t been setup properly by the UEFI. To be able to hot plug, two things are required:
1) Memory space allocation
Thunderbolt controllers are typically allocated around 1GB of memory mapped IO space to allow the CPU to address any peripherals attached to it. Most PCIe peripherals (excluding GPUs) only use kilobytes, at most a few megabytes of memory mapped IO space.
From my own experimentation the UEFI on unsupported boards did allocate the Thunderbolt controller 10 megabytes of MMIO space if attached to the chipset PCIe lanes, which is enough to get basic peripherals working i.e. NICs, NVMe SSDs etc. This appears to because the Intel reference code does this by default and if you are lucky your motherboard manufacturer has left this alone. Anecdotally it appears that AMD do something similar.
2) Reserve PCIe bus indexes
The PCI specification, dating back to the beginning in the 1990s has a frustrating limitation that buses must be numbered in the order that they are enumerated, which is a fixed order based on the order they are physically and logically connected.
When we plug Thunderbolt peripherals in, we add more PCIe buses, which need to have an index number assigned. If that new bus appears in the middle of a tree of PCIe buses, then it cannot be used as there are no free indexes to assign to it, and buses cannot be dynamically re-numbered while the operating system is running.
A Thunderbolt supporting UEFI will “reserve” typically quite a large number, at least 60, of downstream bus indexes for each Thunderbolt port. It cannot go bonkers reserving huge numbers of bus indexes all over the place as there is a hard limit of 256 per system.
Just because your UEFI doesn’t reserve any bus numbers, that doesn’t mean you can’t hot plug
If you attach your Thunderbolt AIC to the highest numbered PCIe bus in your system, assuming there is sufficient MMIO mapping, and that you can physically get at it, you will be able to hot plug on the second port of the Thunderbolt AIC. This works because none of the bus index numbers after the second port are in use, therefore the operating system can assign them to the connected peripheral.
I was able to prove this in the below scenario:
On the above board (AsRock E3C246D4I-2T), the OCuLink port the Thunderbolt AIC is attached to is the highest numbered bus in the system, making port 2 on the AIC the absolute last bus, therefore another bus can be attached to it at run-time.
I wasn’t able to find any product which adapts from OCuLink to PCIe, fortunately I had previously designed my own. For the test I’m using a PCIe protocol XQD/CFExpress reader.
Another interesting thing to note is that if a device is connected at boot time (assuming the card is in SL0 mode), PCIe indexes will be reserved. The card could then be unplugged, then re-plugged later on just fine.
(But maybe not)
Another potential blocker is whether or not the motherboard manufacturer has enabled hot-plugging for the PCIe root port that you’re trying to attach the Thunderbolt AIC to. In my case they have, however this is not a given.
If hot-plug isn’t enabled for the root port in question, you still sort of can still hot plug. After plugging in a new Thunderbolt device, you’d have to open device manager and manually press the “Scan for hardware changes” button. Just as you would have to when plugging a disk into a non hot-plugging SATA port. This does not get around the bus numbering problem I covered earlier.
Operating system hacks to the rescue?
Unfortunately, this problem is where we hit the buffers. The Linux kernel is able to re-number PCIe buses with some advanced hacking – a solution is detailed here.
For Windows, we’re in serious difficulty. There are a couple of dead-end threads on OSR where people have tried to tackle this problem, essentially trying to create something like what Linux is able to do, renumbering buses at boot time. Windows does not do this, and by the looks of it, never will.
One commentator suggests it is possible to create a filter driver which re-numbers PCIe buses just before PCI.sys loads. Even if we could, we’d either have to run with driver signature enforcement disabled, or pay thousands of dollars to have the driver signed for 64-bit use. Also, as the commentator from Microsoft points out, it wouldn’t work in every scenario.
Correct configuration from the UEFI
Ultimately this is the best solution. Reserving resources for hot pluggable PCI systems is a problem which predates Thunderbolt, and UEFI solutions have been available (to OEMs at least) for a long time.
The Intel UEFI Reference Code has an option called “Extra Bus Reserved”, this allows for up to 7 bus indexes to be reserved for each PCIe downstream port on a bridge, well short of the minimum 60 which Thunderbolt initialisation normally allocates. It would’t be enough for daisy chaining or multi-slot Thunderbolt enclosures, but it would be enough for anything else.
Also present here is the option to enable hot-plug and to reserve MMIO space. It’s limited to 20MB but as I previously mentioned, that is enough for the majority of use-cases.
Unfortunately it is ridiculously unlikely that you will find these options readily available to you, supported, and actually working. Motherboard manufacturers have no reason to expose this kind of functionality to end-users.
What are you talking about? It works for me!
Some people apparently find that it all just works – this may be because the motherboard manufacturer has left some actual Thunderbolt initialisation code in the UEFI, or, they’ve just decided to be generous and put in some half decent PCIe hot-plugging initialisation code. Are you one of these people? Please tell me about it!
How else could we get around this problem?
For Linux, there is a solution, but for Windows, something will have to be hacked. At this time I don’t have sufficient understanding to know whether hacking Windows, or the UEFI is the best starting point. It definitely does look to be interesting fodder for a future project. Please get in touch with me if you have researched this subject.
What about the Thunderbolt header?
I haven’t spent much time talking about this because in the context of getting basic functionality working, it’s not a significant part of the problem. The header provides a collection of GPIO signals which form a back channel between the Thunderbolt controller and an SMM Module in the UEFI, which is invisible to the operating system.
The signals on the header mostly appear to be related to power management however there is one notable exception where it is possible for the system to vector into an SMI on hot-plug events. I am not clear why this is necessary and quite frankly it looks a bit nasty. I will be studying all of this in detail in future.
Some of the source code for this mechanism is disclosed in TianoCore. But not all of it. Inevitably vendors can add their own hacky little bits, potentially explaining why the stability of Thunderbolt varies from vendor to vendor.
I can’t yet prove it, but I’m always left with the feeling that Thunderbolt headers are more figleaf than technical necessity.
Is there any light at the end of the tunnel?
From reading through forums, it is clear that there are many people hoping that one day, some kind of Thunderbolt AIC will appear which can be popped into an AMD or that cranky old 3rd gen Intel system and work perfectly. I have been one of those people for years now.
Unfortunately as we can now see, there are already limitations baked into these platforms which make such an addition impossible without hacking. A PCIe enabled USB4 AIC (when, and if one ever surfaces) will not change the situation because it is still a hot pluggable PCIe system with all of the security worries of Thunderbolt and therefore constrained by every problem I’ve outlined on this page.
I’d bet the only difference we’ll see is that it has an ASMedia chip on it, instead of Intel. Otherwise it’ll be the same deal: Made by a motherboard manufacturer, requires a proprietary header, and has a short list of boards it’s supported in. At least that will include AMD boards.
There is nothing preventing Intel (and other stakeholders) from offering a vendor independent solution to these problems, however it would take years of discussion and engineering efforts from all of the relevant parties (of which there are many). Assuming this is already happening, a one-size-fits-all solution is likely still years, and a new motherboard away.
For now, it’s easier for OEMs to tell us to plug the Thunderbolt header cable in to the motherboard. Haven’t got the header on your board? Goodbye.