Calling C# DLLs from C++ — survey + LeYuECat case study

Trigger (2026-04-23): A .NET Framework 4.8 WPF semiconductor equipment control app needs to ship its C# motion/IO logic to a C++ customer. Outcome (2026-04-30): LeYu.ECat.Cpp.Wrapper (internal codename LeYuECat) shipped, wrapping LeYuEQ.Plugin.Motion.Delta.EtherCAT into an OO API that customer C++ code uses with a single #include <LeYuECat.h>. This article surveys the options first, then walks through the implementation.


Bottom line

A C# DLL is a managed assembly with no native exports, so C++ cannot LoadLibrary it and call its functions directly. You always need a bridge that exposes the managed code through a C ABI (plain C functions with a fixed calling convention).

For an existing .NET Framework 4.8 C# codebase, a C++/CLI wrapper is realistically the only painless, officially supported, long-term maintainable option. Implementation details below.


Four approaches compared

Approach | Supported .NET versions | Mechanism | Deployment complexity | Performance | Best for
C++/CLI Wrapper | Framework 4.8 ✅, .NET 5+ ✅ | /clr mode hosts both managed + native | Medium (extra DLL) | CLR startup overhead | Industrial control / equipment software, existing .NET Framework ecosystem
Native AOT + UnmanagedCallersOnly | .NET 7+ only, ❌ Framework | AOT-compile to a true native DLL | Low (single DLL) | Best (no CLR) | New projects, tooling, UE5 integration
DNNE | .NET 5+ only, ❌ Framework | Auto-generated native shim + boots the CLR | High (three-piece set) | Medium | When you want neither AOT nor C++/CLI
DllExport (community) | Framework 4.8 ✅ | IL post-processing injects exports | Low (single DLL) | Medium | Framework 4.8 without writing C++/CLI

Approach details

Approach 1: C++/CLI bridge layer (what LeYuECat actually uses)

Write a C++/CLI project (/clr mode) that serves both sides:

  • Inward: #using <Foo.dll> to reference the C# DLL directly; write managed code that calls C# objects
  • Outward: expose a standard C ABI (extern "C" __declspec(dllexport)) to native C++

Pros

  • Most mature; native Visual Studio support
  • Complex types (classes, strings, arrays) feel natural to use
  • Supports .NET Framework 4.8
  • The C# side can keep holding GC objects and throwing exceptions; all cross-boundary conversion is concentrated in this layer

Cons

  • Locked to Windows + .NET
  • CLR load cost
  • One more DLL in the dist (the CppCli middle layer)

Approach 2: .NET Native AOT + UnmanagedCallersOnly

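The official .NET 7+ solution. Mark C# methods with [UnmanagedCallersOnly] and compile with Native AOT to a real native DLL that C++ can link against or load directly with LoadLibrary / GetProcAddress. (DllImport is how C# calls native code, so it plays no part on the C++ side.)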

[UnmanagedCallersOnly(EntryPoint = "Add",
CallConvs = new[] { typeof(CallConvCdecl) })]
public static int Add(int a, int b) => a + b;

Pros

  • Output is a pure native DLL — no CLR required
  • Fast startup, usable from any C++ program

Cons

  • Requires .NET 7/8+; .NET Framework 4.8 is not supported
  • The cross-boundary API must be pure C ABI (no string, List<T>, exceptions, or GC objects)
  • You have to move the entire C# codebase to a new runtime

Approach 3: DNNE

AaronRobinsonMSFT/DNNE reads C# methods marked [UnmanagedCallersOnly] and auto-generates a native shim DLL (suffixed NE.dll). C++ calls the shim; the shim internally boots the .NET runtime to execute the managed code.

Pros

  • No C++/CLI; no AOT
  • Produces a standard C header + import lib

Cons

  • Deployment is a three-piece set: shim DLL + managed DLL + .runtimeconfig.json
  • Requires modern .NET (no Framework 4.8 support)

Approach 4: DllExport (community)

UnmanagedExports (Robert Giesecke) or 3F/DllExport. Uses IL post-processing to inject C# methods as DLL exports.

[DllExport("Count", CallingConvention = CallingConvention.StdCall)]
public static int Count(IntPtr stringPtr) { ... }

Pros

  • Native .NET Framework support
  • C++ uses it directly via LoadLibrary
  • Single DLL

Cons

  • Relies on a third-party post-build hack
  • Not officially supported
  • Some static analyzers complain
  • Maintenance risk

Decision tree

Where does your C# DLL run?
├─ .NET Framework 4.8 (the common case today)
│   ├─ Needs to be called by existing C++ codebase → C++/CLI Wrapper ⭐
│   └─ Only need to export a handful of functions → DllExport
│
└─ .NET 7/8+ (new project)
    ├─ Can rewrite, want best perf → Native AOT + UnmanagedCallersOnly ⭐
    └─ Don't want AOT → DNNE

LeYuECat case study

The existing C# plugin for LeYu Delta EtherCAT (LeYuEQ.Plugin.Motion.Delta.EtherCAT, net48) needs to ship to a C++ customer. This is the actual outcome of going with C++/CLI.

Three-layer architecture

Customer C++ ──#include <LeYuECat.h>──▶ extern "C" __stdcall flat ABI
      │
      ▼
LeYu.ECat.CppCli.dll (C++/CLI /clr, x64, net48)
      │ gcroot<Object^>
      ▼
LeYu.ECat.Managed.dll (C# net48, Tomlyn)
      │ P/Invoke
      ▼
EtherCAT_DLL_x64.dll (Delta runtime)

Each layer has a clearly scoped responsibility:

  • LeYuECat.h (single header): the entire surface the customer sees. Contains the extern "C" ECat_* function declarations, the OO wrappers (LeYu::EtherCATService / LeYu::Axis / …), an inline switch that maps Delta error codes to English strings, and EtherCATException (inheriting from std::runtime_error). The customer only needs this one .h file.
  • CppCli/ (thin /clr shim): Exports.cpp implements every ECat_* entry point; HandleTable.{h,cpp} maps opaque void* handles to managed objects; Logging.cpp converts native callbacks to managed delegates. No business logic.
  • Managed/ (C# net48): where the real logic lives. The policy is to copy verbatim from the upstream plugin project, with a header comment that says "Do not diverge without syncing back" to keep this from forking locally.

Why C++/CLI

  • The main app is .NET Framework 4.8, which immediately rules out approaches 2 and 3 (AOT, DNNE)
  • Approach 4 (DllExport) can't elegantly handle "the C# side is OO, with multiple service/axis objects that the C++ side needs to hold". You'd end up hand-writing a handle table behind every exported function, which is exactly the work a C++/CLI layer does, so you might as well write C++/CLI
  • The C# side has lots of exceptions, strings, Task<T>, List<T> — C++/CLI's cross-boundary try/catch plus msclr::interop is the lowest-friction way to deal with all of it

Build configuration (minimum viable setup)

vcxproj essentials:

<ConfigurationType>DynamicLibrary</ConfigurationType>
<PlatformToolset>v143</PlatformToolset> <!-- or v145, must match the VS version -->
<CLRSupport>true</CLRSupport>
<TargetFrameworkVersion>v4.8</TargetFrameworkVersion>

Individual components that must be ticked in the VS Installer:

  • Desktop development with C++ (workload)
  • .NET Framework 4.8 targeting pack
  • C++/CLI support for v143 (or v145) build tools ← often forgotten; without it the build fails outright

x64 only. AnyCPU / Win32 both blow up (Delta's EtherCAT_DLL_x64.dll is 64-bit only).

Shipping layout (4 DLLs + 1 import lib + 1 header)

dist/
  bin/
    LeYu.ECat.CppCli.dll    ← /clr bridge
    LeYu.ECat.CppCli.lib    ← import lib for the customer to link against
    LeYu.ECat.Managed.dll   ← C# logic
    EtherCAT_DLL_x64.dll    ← Delta SDK
    Tomlyn.dll              ← managed dependency
  include/
    LeYuECat.h
  redist/
    VC_redist.x64.exe

The customer does #include <LeYuECat.h> + links LeYu.ECat.CppCli.lib; at runtime they drop everything under bin/ next to their .exe. No GAC, no .runtimeconfig.json, no registration step required.


Cross-boundary ground rules — mapped to LeYuECat

Things that cannot cross a C ABI directly

  • C# string / List<T> / Dictionary / any GC object
  • C# exceptions
  • C++ std::string / std::vector / C++ classes / C++ exceptions
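One way to enforce this list is a compile-time check on every struct that appears in an exported signature. The sketch below is illustrative only: ECat_AxisStatus and ECat_Axis_GetStatusFn are hypothetical names, not part of the real LeYuECat.h.

```cpp
#include <cstdint>
#include <type_traits>

extern "C" {
// Hypothetical status record, for illustration only: fixed-width integers,
// doubles and a fixed-size char buffer; no std::string or other class types.
struct ECat_AxisStatus {
    int32_t axisIndex;
    int32_t isServoOn;   // 0 or 1; an int keeps the layout unambiguous
    double  position;
    char    name[32];    // NUL-terminated, truncated if longer
};

// Hypothetical export shape: status code as the return value, data via out-pointer.
typedef int32_t (*ECat_Axis_GetStatusFn)(void* axisHandle, ECat_AxisStatus* outStatus);
}

// Compile-time guard: the build fails if someone adds a member that is not
// C-compatible (for example a std::string or a type with a virtual function).
static_assert(std::is_standard_layout<ECat_AxisStatus>::value,
              "boundary structs must be standard-layout");
static_assert(std::is_trivially_copyable<ECat_AxisStatus>::value,
              "boundary structs must be trivially copyable");
```

The two static_asserts cost nothing at runtime and turn an accidental ABI break into a build error rather than a crash at the customer site.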

Pattern 1: opaque handles + gcroot table to proxy GC objects

The C++ side only ever sees a void* (really just an incrementing integer); the actual managed objects are held by a map maintained inside the C++/CLI layer:

// HandleTable.cpp (excerpt)
struct Entry { gcroot<Object^>* root; };
static std::unordered_map<void*, Entry> g_table;
static std::mutex g_mutex;
static uintptr_t g_nextHandle = 1;

void* RegisterObject(Object^ obj) {
    std::lock_guard<std::mutex> lock(g_mutex);
    void* handle = reinterpret_cast<void*>(g_nextHandle++);
    g_table[handle] = { new gcroot<Object^>(obj) };
    return handle;
}

Object^ ResolveObject(void* handle) {
    std::lock_guard<std::mutex> lock(g_mutex);
    auto it = g_table.find(handle);
    return it == g_table.end() ? nullptr : (Object^)(*(it->second.root));
}

void UnregisterObject(void* handle) {
    std::lock_guard<std::mutex> lock(g_mutex);
    auto it = g_table.find(handle);
    if (it == g_table.end()) return;
    delete it->second.root; // release gcroot so the GC can reclaim the managed object
    g_table.erase(it);
}

Key points:

  • The map stores pointers to heap-allocated gcroot<Object^> instances rather than gcroot values, so each entry is new'd on registration and deleted explicitly on release, which frees the underlying GC handle
  • A handle made by reinterpret_casting an incrementing integer to void* is enough; you don't need an actual pointer. Never hand the customer the address of a real managed object
  • The whole table is protected by a mutex; Resolve does not extend object lifetime — gcroot keeps the object alive
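The same table shape can be exercised in plain C++ without the CLR. The following is a minimal sketch, not the project's HandleTable: std::shared_ptr<void> stands in for gcroot<Object^>, and the class name and methods are assumptions made for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Stand-in for gcroot<Object^>: the shared_ptr keeps the target alive while
// the table holds it, much as a gcroot keeps a managed object alive.
using Root = std::shared_ptr<void>;

class HandleTable {
public:
    // Returns an opaque token that is never reused; it is not a real address.
    void* Register(Root obj) {
        std::lock_guard<std::mutex> lock(mutex_);
        void* handle = reinterpret_cast<void*>(next_++);
        table_[handle] = std::move(obj);
        return handle;
    }
    // Unknown or already-released handles resolve to an empty Root.
    Root Resolve(void* handle) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = table_.find(handle);
        return it == table_.end() ? Root() : it->second;
    }
    // Returns true if the handle was live; releasing twice is harmless.
    bool Unregister(void* handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        return table_.erase(handle) == 1;
    }
private:
    mutable std::mutex mutex_;
    std::unordered_map<void*, Root> table_;
    std::uintptr_t next_ = 1;  // starts at 1 so nullptr is never a valid handle
};
```

Because tokens come from a counter rather than from object addresses, a stale handle used after Unregister simply fails to resolve instead of aliasing a newer object.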

Pattern 2: side tables keyed by handle

Pure managed objects can't cleanly carry the kind of state the C++ side cares about — "what config path was I created with", "have I already been explicitly shut down" — so LeYuECat keeps a few extra unordered_map<void*, T> in the CppCli layer:

// Exports.cpp (excerpt)
static std::unordered_map<void*, std::string> g_configPaths;
static std::unordered_map<void*, std::vector<void*>> g_serviceAxisHandles;
static std::unordered_set<void*> g_shutDownServices;
  • g_configPaths — each service handle remembers its own TOML path, and Initialize calls the InitializeAsync(string) overload with it. An earlier version "copied the TOML to a fixed filename next to the managed assembly", which races in the dual-card scenario
  • g_serviceAxisHandles — after a service Initialize, all axis handles are pre-registered in one shot; subsequent GetAxis(i) calls just hit the cache. Without this, high-frequency polling would grow the HandleTable every second
  • g_shutDownServices — customers commonly call Shutdown() explicitly and then the destructor calls Shutdown() again. Without this flag the global refcount goes negative
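The axis cache and the shut-down flag can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the real Exports.cpp: the function names, the integer refcount stand-in, and the axis handle numbering are hypothetical, and locking is omitted for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Illustrative stand-ins only. A real layer would guard these with a mutex.
static std::unordered_map<void*, std::vector<void*>> g_axisCache;
static std::unordered_set<void*> g_shutDown;
static int g_masterRefCount = 0;          // stand-in for the process-wide refcount
static std::uintptr_t g_nextAxis = 1000;  // arbitrary starting value for axis tokens

// Called once from Initialize: register every axis handle up front.
void CacheAxes(void* service, int32_t axisCount) {
    std::vector<void*>& axes = g_axisCache[service];
    axes.clear();
    for (int32_t i = 0; i < axisCount; ++i)
        axes.push_back(reinterpret_cast<void*>(g_nextAxis++));
    ++g_masterRefCount;
}

// Polling path: a pure lookup, so repeated calls never grow any table.
void* GetAxis(void* service, int32_t index) {
    if (g_shutDown.count(service) != 0) return nullptr;
    auto it = g_axisCache.find(service);
    if (it == g_axisCache.end() || index < 0) return nullptr;
    const std::size_t i = static_cast<std::size_t>(index);
    return i < it->second.size() ? it->second[i] : nullptr;
}

// Idempotent: only the first Shutdown for a service releases the refcount.
void Shutdown(void* service) {
    if (g_axisCache.count(service) == 0) return;     // never initialized
    if (!g_shutDown.insert(service).second) return;  // second call: no-op
    --g_masterRefCount;
}
```

Calling Shutdown explicitly and then again from a destructor leaves the refcount at zero rather than at minus one, which is the failure mode the flag exists to prevent.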

Pattern 3: process-wide refcounted resource across N instances

Delta's _ECAT_Master_Open / _ECAT_Master_Close are process-global singletons that can only be called once each, but the wrapper allows N service instances (one per EtherCAT card). The fix is a static refcount:

public static class EtherCATMasterLifetime
{
    private static int _refCount;
    private static ushort _cachedExistCard;
    private static readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    // Test seam — defaults to the real DLL; tests swap these for lambdas
    public static MasterOpenFunc OpenFunc = CEtherCAT_DLL.CS_ECAT_Master_Open;
    public static MasterCloseFunc CloseFunc = CEtherCAT_DLL.CS_ECAT_Master_Close;

    public static ushort AcquireFirstOpen(out ushort existCard)
    {
        _gate.Wait();
        try {
            int newCount = Interlocked.Increment(ref _refCount);
            if (newCount == 1) {
                existCard = 0;
                ushort ret = OpenFunc(ref existCard); // 0→1 transition: actually call
                _cachedExistCard = existCard;
                return ret;
            }
            existCard = _cachedExistCard; // subsequent acquires: skip
            return 0;
        }
        finally { _gate.Release(); }
    }

    public static ushort ReleaseLastClose() { /* symmetric: only Close on 1→0; clamp <0 and warn */ }
}

This is the most explosion-prone area of the project. A few lessons:

  • Use SemaphoreSlim for the gate, not lock: the upper service layer is async (InitializeAsync / ShutdownAsync), a lock statement cannot be held across an await, and SemaphoreSlim offers WaitAsync for that case. This version still calls the blocking Wait(), but the acquire/release semantics must stay the same if it later moves to WaitAsync
  • existCard must be cached: the 0→1 call is when Delta tells you "how many cards were found"; subsequent acquires must not call it again (it would be treated as a second init); they need to return the cached value
  • Negative refcounts will happen: customer code double-shutting-down, calling Destroy from inside an exception handler, etc. Clamp to 0 and log a warning; this is safer than throwing
  • Always leave a test seam: OpenFunc / CloseFunc are public static delegates defaulting to the real P/Invoke. Tests swap them for lambdas in setUp and can fully validate the refcount semantics without hardware:
    EtherCATMasterLifetime.OpenFunc = (ref ushort existCard) => {
        Interlocked.Increment(ref _openCallCount);
        existCard = 1;
        return 0;
    };
    // Then run 8 concurrent AcquireFirstOpen calls and assert _openCallCount == 1
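For readers on the native side, the same refcount shape is sketched below in standard C++ with injectable open/close functions. This is an illustrative analogue, not the project's C# class: MasterLifetime and its members are assumed names, and unlike the C# excerpt this sketch takes no reference when the open call fails.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

class MasterLifetime {
public:
    using OpenFn  = std::function<std::uint16_t(std::uint16_t& existCard)>;
    using CloseFn = std::function<std::uint16_t()>;

    // The seam: production code passes the real driver calls, tests pass lambdas.
    MasterLifetime(OpenFn open, CloseFn close)
        : open_(std::move(open)), close_(std::move(close)) {}

    // Only the 0 -> 1 transition calls open_; later acquires reuse the cached count.
    std::uint16_t Acquire(std::uint16_t& existCard) {
        std::lock_guard<std::mutex> lock(gate_);
        if (refCount_ == 0) {
            std::uint16_t found = 0;
            const std::uint16_t ret = open_(found);
            existCard = found;
            if (ret != 0) return ret;  // sketch's choice: a failed open takes no reference
            cachedExistCard_ = found;
        }
        ++refCount_;
        existCard = cachedExistCard_;
        return 0;
    }

    // Only the 1 -> 0 transition calls close_; unbalanced releases are clamped at 0.
    std::uint16_t Release() {
        std::lock_guard<std::mutex> lock(gate_);
        if (refCount_ == 0) return 0;  // extra release: ignore (a real layer would log)
        if (--refCount_ == 0) return close_();
        return 0;
    }

    int RefCount() const {
        std::lock_guard<std::mutex> lock(gate_);
        return refCount_;
    }

private:
    OpenFn open_;
    CloseFn close_;
    mutable std::mutex gate_;
    int refCount_ = 0;
    std::uint16_t cachedExistCard_ = 0;
};
```

A test can construct it with counting lambdas, call Acquire from several threads, and check that the open lambda ran exactly once; no hardware or driver DLL is involved.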

Pattern 4: native callback ↔ managed delegate

When a customer wants to plug in their own logger, they pass a __stdcall C function pointer into managed code. Marshal.GetDelegateForFunctionPointer does the conversion:

// Logging.cpp
int32_t __stdcall ECat_SetLogCallback(ECat_LogCallback cb)
{
    if (cb == nullptr) {
        LeYu::CppCli::ActiveLogger::Current = nullptr; // restore default FileLogger
        return 0;
    }
    IntPtr fp(reinterpret_cast<void*>(cb));
    auto del = safe_cast<LeYu::ECat::Logging::CallbackLogger::LogCallbackDelegate^>(
        Marshal::GetDelegateForFunctionPointer(
            fp,
            LeYu::ECat::Logging::CallbackLogger::LogCallbackDelegate::typeid));
    LeYu::CppCli::ActiveLogger::Current =
        gcnew LeYu::ECat::Logging::CallbackLogger(del);
    return 0;
}

The C# delegate type must match the C signature exactly: the same parameter order, compatible parameter types, and the same calling convention:

// CallbackLogger.cs
public delegate void LogCallbackDelegate(int level, string category, string message);
// LeYuECat.h
typedef void (__stdcall *ECat_LogCallback)(int32_t level, const char* category, const char* message);

Notes:

  • The C# delegate is declared with string, but in the managed-to-native direction the P/Invoke marshaller automatically marshals string as LPStr (ANSI char*), which matches the C side's const char*
  • If the customer's callback throws, you must try/catch it on the managed side and swallow it — letting an exception cross the C ABI is UB
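A complementary guard can live on the customer side: hand the library a noexcept trampoline instead of the real logger, so nothing can propagate back across the boundary in the first place. The sketch below is an assumption-laden illustration: g_userLogger, g_droppedLogCalls and LogTrampoline are hypothetical names, and __stdcall is left off the typedef so the code also builds with non-Microsoft compilers.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>

// Same parameter list as the typedef in LeYuECat.h; __stdcall is omitted here
// so the sketch stays portable.
typedef void (*ECat_LogCallback)(int32_t level, const char* category, const char* message);

// The customer's real logger; it is free to throw.
static std::function<void(int, const std::string&, const std::string&)> g_userLogger;
static int g_droppedLogCalls = 0;

// The function actually handed across the boundary. noexcept plus catch(...)
// guarantees that no C++ exception unwinds into the marshalling layer.
void LogTrampoline(int32_t level, const char* category, const char* message) noexcept {
    try {
        if (!g_userLogger) return;
        g_userLogger(level, category ? category : "", message ? message : "");
    } catch (...) {
        ++g_droppedLogCalls;  // swallow: a logger failure must not take down the caller
    }
}
```

The customer would then pass &LogTrampoline to the set-callback export and assign the real logging code to g_userLogger.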

Pattern 5: every entry point is "try → translate to error code"

Managed exceptions cannot cross a C ABI. Every ECat_* follows this shape:

int32_t __stdcall ECat_Service_Initialize(void* handle, int32_t* outSuccess)
{
    if (handle == nullptr || outSuccess == nullptr) return ERR_PARAMETER;
    try {
        auto svc = ResolveService(handle);
        if (svc == nullptr) return ERR_PARAMETER;
        // GetAwaiter().GetResult() rethrows the original exception. Task::Result would
        // wrap it in AggregateException and bypass the HardwareException handler below.
        bool ok = svc->InitializeAsync(...)->GetAwaiter().GetResult();
        *outSuccess = ok ? 1 : 0;
        return 0;
    }
    catch (LeYu::ECat::Core::HardwareException^ hex) {
        return (int32_t)hex->ErrorCode; // business exception → real error code
    }
    catch (System::Exception^ ex) {
        LeYu::CppCli::LogManagedException(ex);
        return ERR_NOT_SUPPORT; // anything else → generic fallback
    }
}

HardwareException carries the original Delta error code and can be returned as-is to the C++ side; any other exception is swallowed and logged, returning 0xF009 (ERR_ECAT_NOT_SUPPORT). The customer's OO wrapper throws EtherCATException whenever it sees a non-zero code, mapping the code to an English message — so the full chain is C# exception → error code → C++ exception, and the C ABI segment in between is strictly pure int32_t.
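The "try, then translate to an error code" shape can be factored into one helper so that no entry point forgets it. Below is a minimal native C++ sketch of that idea, not the project's code: HardwareError stands in for HardwareException, and Guard, Demo_Divide, kErrParameter and the 0x1234 code are hypothetical. Only the 0xF009 fallback value is taken from the text above.

```cpp
#include <cstdint>
#include <stdexcept>

constexpr int32_t kOk = 0;
constexpr int32_t kErrParameter = -1;     // hypothetical value, for the demo only
constexpr int32_t kErrFallback = 0xF009;  // generic fallback value quoted in the article

// Stand-in for HardwareException: carries a device error code.
class HardwareError : public std::runtime_error {
public:
    explicit HardwareError(int32_t code)
        : std::runtime_error("hardware error"), code_(code) {}
    int32_t code() const noexcept { return code_; }
private:
    int32_t code_;
};

// Runs body() and converts every outcome into a status code; nothing escapes.
template <typename Body>
int32_t Guard(Body&& body) noexcept {
    try {
        body();
        return kOk;
    } catch (const HardwareError& e) {
        return e.code();      // domain error: pass the real code through
    } catch (...) {
        return kErrFallback;  // anything else: generic fallback
    }
}

// Example entry point written with the helper (hypothetical name).
extern "C" int32_t Demo_Divide(int32_t a, int32_t b, int32_t* outResult) {
    if (outResult == nullptr) return kErrParameter;
    return Guard([&] {
        if (b == 0) throw HardwareError(0x1234);  // made-up code for the demo
        *outResult = a / b;
    });
}
```

Keeping the catch ladder in a single template means a new exception category needs one edit rather than one per export.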

Pattern 6: marshalling strings and arrays

Strings:

// C++ → C#
System::String^ s = msclr::interop::marshal_as<System::String^>(configPath);

// C# → C++ output buffer (caller-allocated)
static void CopyStringToBuffer(System::String^ src, char* dst, int32_t dstLen) {
    if (dst == nullptr || dstLen <= 0) return;
    if (src == nullptr) { dst[0] = '\0'; return; }
    std::string s = msclr::interop::marshal_as<std::string>(src);
    strncpy(dst, s.c_str(), (size_t)(dstLen - 1));
    dst[dstLen - 1] = '\0';
}

Arrays always use the shape T* outBuffer + int32_t bufferCount + int32_t* outActualCount, and the managed side copies into the buffer:

static void CopyDoubleArray(array<double>^ src, double* dst,
                            int32_t bufferCount, int32_t* outActualCount)
{
    int32_t len = (src != nullptr) ? src->Length : 0;
    if (outActualCount != nullptr) *outActualCount = len;
    if (dst == nullptr || bufferCount <= 0 || src == nullptr) return;
    int32_t copyCount = (len < bufferCount) ? len : bufferCount;
    for (int32_t i = 0; i < copyCount; i++) dst[i] = src[i];
}

Do not pin a managed array and hand it to C++ for long-term use — pinning blocks GC compaction, and the C++ side has no good way to know when it can let go. Copy out instead — much cleaner.
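Since CopyDoubleArray reports the full length even when no buffer is supplied, a caller can use a two-call pattern: ask for the count, allocate, then fetch. The helper below is a sketch of that caller side. ReadAllDoubles and FakeGetPositions are hypothetical names, and the fake export only mimics the copy behaviour shown above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Signature shape from the article: buffer, capacity, out-count.
typedef int32_t (*ReadDoublesFn)(double* outBuffer, int32_t bufferCount,
                                 int32_t* outActualCount);

// Calls fn once without a buffer to learn the length, then again with storage.
// Returns the status code of the last call; out holds what was copied.
int32_t ReadAllDoubles(ReadDoublesFn fn, std::vector<double>& out) {
    out.clear();
    int32_t needed = 0;
    int32_t rc = fn(nullptr, 0, &needed);
    if (rc != 0 || needed <= 0) return rc;

    out.resize(static_cast<std::size_t>(needed));
    int32_t actual = 0;
    rc = fn(out.data(), needed, &actual);
    if (rc != 0) { out.clear(); return rc; }
    // The source may have shrunk between the two calls; drop the unused tail.
    if (actual < needed) out.resize(static_cast<std::size_t>(actual));
    return 0;
}

// Stand-in for an exported function, mirroring CopyDoubleArray's behaviour.
int32_t FakeGetPositions(double* dst, int32_t bufferCount, int32_t* outActualCount) {
    static const double kData[] = {1.5, 2.5, 3.5};
    const int32_t len = 3;
    if (outActualCount != nullptr) *outActualCount = len;
    if (dst == nullptr || bufferCount <= 0) return 0;
    const int32_t n = (len < bufferCount) ? len : bufferCount;
    for (int32_t i = 0; i < n; ++i) dst[i] = kData[i];
    return 0;
}
```

The caller owns the memory throughout, so there is nothing to free across the boundary and nothing pinned on the managed side.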

Pattern 7: the customer-facing OO wrapper is a pure inline header

LeYu::Axis / LeYu::EtherCATService in LeYuECat.h are fully inline and zero-cost:

class EtherCATService {
public:
    explicit EtherCATService(const std::string& path) : handle_(nullptr) {
        detail::Check(::ECat_Service_CreateFromConfig(path.c_str(), &handle_));
    }
    ~EtherCATService() {
        if (handle_) { try { ::ECat_Service_Destroy(handle_); } catch(...) {} }
    }

    EtherCATService(const EtherCATService&) = delete; // not copyable
    EtherCATService(EtherCATService&& o) noexcept // movable
        : handle_(o.handle_) { o.handle_ = nullptr; }

    bool Initialize() {
        int32_t ok = 0;
        detail::Check(::ECat_Service_Initialize(handle_, &ok));
        return ok != 0;
    }
    Axis GetAxis(int32_t i) {
        void* ah = nullptr;
        detail::Check(::ECat_Service_GetAxis(handle_, i, &ah));
        return Axis(ah);
    }
private:
    void* handle_;
};

Principles:

  • RAII: the destructor must always swallow exceptions, to avoid throwing again during stack unwinding
  • Service has unique ownership: copy is = deleted; move is kept
  • Axis is a value type: it only holds a void*; ownership is on the service; the destructor is a no-op — axes become invalid implicitly when the service is destroyed
  • All extern "C" calls go through detail::Check(int32_t): any non-zero result throws EtherCATException with the code mapped to an English string
  • The error code lookup table is inlined in the header: the customer doesn't need to consult separate docs; IDE go-to-definition reveals it
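The Check helper and the inline lookup table might look roughly like the sketch below. It is a guess at the shape, not the contents of the real header: the LeYuSketch namespace and the message strings are placeholders, and only the 0xF009 code appears in this article.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

namespace LeYuSketch {  // hypothetical namespace; this is not the real header

// Inline lookup table. Only 0xF009 comes from the article; texts are placeholders.
inline const char* ErrorText(int32_t code) {
    switch (code) {
        case 0:      return "OK";
        case 0xF009: return "Operation not supported";
        default:     return "Unknown error";
    }
}

// Carries the numeric code alongside the human-readable message.
class EtherCATException : public std::runtime_error {
public:
    explicit EtherCATException(int32_t code)
        : std::runtime_error(std::string(ErrorText(code)) +
                             " (code " + std::to_string(code) + ")"),
          code_(code) {}
    int32_t code() const noexcept { return code_; }
private:
    int32_t code_;
};

namespace detail {
// Every extern "C" call is funnelled through here: non-zero means throw.
inline void Check(int32_t rc) {
    if (rc != 0) throw EtherCATException(rc);
}
}  // namespace detail

}  // namespace LeYuSketch
```

Because everything is inline, the header stays the only artifact the customer has to include, and catching the exception still gives access to the original numeric code.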

Things to avoid

  • Don't let a GC object's lifetime cross the boundary — use a handle table + gcroot<Object^> proxy
  • Don't let exceptions cross a C ABI — always try/catch in the managed/CLI layer and translate to error codes
  • Don't hold a raw managed pointer on the native side long-term: keep the object alive through a GCHandle (GCHandle.Alloc; pin only for short, bounded access) or a handle table
  • Don't ignore CallingConvention mismatches — the stack will blow up and the error is extremely hard to diagnose
  • Don't mix x86/x64 — BadImageFormatException is this 99% of the time
  • Don't treat a process-singleton native init as instance-level — multi-instance scenarios will always blow up; you need refcounting
  • Don't pin managed arrays for long-term C++ use — use the "caller allocates the buffer, managed side copies" pattern instead
  • Don't couple tests to real hardware — make the cross-boundary P/Invoke delegates injectable static fields; tests swap them for lambdas
  • Don't use an obfuscator that rewrites module init / metadata (such as .NET Reactor's NecroBit) on a managed DLL that will be loaded by C++/CLI — it blows up in cctor during mixed-mode init with NRE → mscorlib recursive resource lookup → CLR FailFast. See .NET Reactor × C++/CLI pitfalls
  • Don't treat Native AOT as a universal answer: scenarios that need C++ classes, std::vector, natural exception propagation, COM registration, or cross-process calls are not a fit for it

Concrete advice for semiconductor equipment control software

Scenario A: an existing Motion / PLC / Vision SDK (C++) needs to call C# control logic

C++/CLI Wrapper

The industrial standard. Native Visual Studio support. Easiest to debug for ASE/ChipMOS FAEs. LeYuECat takes this route — you can use it directly as a reference implementation.

Scenario B: a TwinCore middleware / UE5 visualization needs to call natively

.NET 8 + Native AOT + UnmanagedCallersOnly

Split it into a separate module. The output is a pure native DLL — the cleanest UE5 integration, with no CLR startup latency to affect real-time simulation.

Scenario C: just want to expose a single C# function to C++ without architectural changes

DllExport community package

Lowest effort, but you have to weigh the long-term maintenance risk (unofficial, IL post-processing).


References

LeYuECat internal project

  • D:\Documents\LeYu\Workspace\EtherCAT.Cpp.Wrapper (Forgejo: Leyu/EtherCAT.Cpp.Wrapper): the case study implementation
    • src/CppHeader/include/LeYuECat.h: customer-facing header
    • src/CppCli/Exports.cpp / HandleTable.cpp / Logging.cpp: /clr bridge implementation
    • src/Managed/Lifetime/EtherCATMasterLifetime.cs: refcount example
    • src/Managed/Managed.Tests/LifetimeTests.cs: delegate seam testing example
    • samples/03_TwoCards_Parallel/main.cpp: dual-card concurrent lifetime test

DNNE (modern non-AOT approach)

  • AaronRobinsonMSFT/DNNE — GitHub Official README. Covers dnne-gen generating the native shim, the NE-suffixed binary, output .h / .lib, RID configuration, MSBuild properties like DnneWindowsExportsDef
