Mixed-programming with CUDA C to create DLL for Excel

Hello All,

In the past, I have successfully created Fortran DLLs with OpenMP for use with Excel VBA. I would now like to integrate some CUDA C GPU code, and I am trying to use the Fortran 2003 C interoperability features to make Intel Fortran talk to CUDA C. I have been able to build an executable that shows the expected behavior. However, when I compile the same code as a DLL and call it from Excel, Excel crashes without warning and without any diagnostic information. If anyone has observed this behavior and found a workaround, I would be glad to get any kind of help. My development configuration and test code are as follows.

Thanks in advance,

Sam V

Build setup: Win 6 x64; Microsoft Excel 2010 VBA; Intel Composer XE 2013 IA-32 with Visual Studio 2008; NVIDIA CUDA C v5.5

Example code:

Fortran code (excelcuda.f90; uncomment/comment the relevant lines to compile as an executable)

!program main
!implicit none
!real*4::xx(4),yy(4)
!xx=1.D0
!yy=2.D0
!write(*,*) xx, yy
!call myarrtest(xx,yy,4)
!write(*,*) xx, yy
!end program


subroutine myarrtest(arrin,arrout,sz1)

!DEC$ ATTRIBUTES DLLEXPORT,STDCALL,REFERENCE,DECORATE,ALIAS:'myarrtest'::myarrtest
!DEC$ ATTRIBUTES REFERENCE::arrin,arrout,sz1

USE, INTRINSIC :: ISO_C_BINDING
implicit none

INTERFACE
    SUBROUTINE kernel_wrapper (flt_a, flt_b, int_n) BIND(C)
    IMPORT
    INTEGER(C_INT), INTENT(IN) :: int_n
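    ! flt_a receives the result (a = a + b), so it must be INTENT(INOUT)
    REAL(C_FLOAT), INTENT(INOUT) :: flt_a(int_n)
    REAL(C_FLOAT), INTENT(IN) :: flt_b(int_n)
    END SUBROUTINE kernel_wrapper
END INTERFACE

integer*4::i
integer*4,intent(in)::sz1
real*4,dimension(sz1),intent(in)::arrin
! arrout is read as well as written (arrout = arrout + arrin)
real*4,dimension(sz1),intent(inout)::arrout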

!do i=1,sz1
!arrout(i)=arrin(i)+arrout(i)
!end do

CALL kernel_wrapper(arrout, arrin, sz1)

end subroutine

CUDA C kernel (cudakernel.cu)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime.h>


// simple kernel function that adds two vectors
__global__ void vect_add(float *a, float *b, int N)
{
   int idx = threadIdx.x;
   if (idx<N) a[idx] = a[idx] + b[idx];
}

// function called from main fortran program
extern "C" void kernel_wrapper(float *a, float *b, int *Np)
{
   float  *a_d, *b_d;  // declare GPU vector copies

   int blocks = 1;     // uses 1 block of
   int N = *Np;        // N threads on GPU

   // Allocate memory on GPU
   cudaMalloc( (void **)&a_d, sizeof(float) * N );
   cudaMalloc( (void **)&b_d, sizeof(float) * N );

   // copy vectors from CPU to GPU
   cudaMemcpy( a_d, a, sizeof(float) * N, cudaMemcpyHostToDevice );
   cudaMemcpy( b_d, b, sizeof(float) * N, cudaMemcpyHostToDevice );

   // call function on GPU
   vect_add<<< blocks, N >>>( a_d, b_d, N);

   // copy the result vector back from GPU to CPU
   // (b is not modified by the kernel, so it does not need to be copied back)
   cudaMemcpy( a, a_d, sizeof(float) * N, cudaMemcpyDeviceToHost );

   // free GPU memory
   cudaFree(a_d);
   cudaFree(b_d);
   return;
}
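
One way to narrow the problem down, offered only as a sketch and not as a confirmed fix: link a CPU-only stand-in for kernel_wrapper in place of cudakernel.obj. If Excel still crashes with the stand-in, the problem is more likely in the VBA/Fortran interface than in the CUDA code. The stand-in below keeps the same C signature as the wrapper above. The helper trace_step and the log file name excelcuda_trace.log are hypothetical names introduced here for illustration. The file is opened relative to the current working directory, which may not be writable when the DLL is loaded by Excel, so the path may need adjusting. Variables are declared at the top of each block because the C compiler in Visual Studio 2008 does not accept C99-style declarations.

```c
#include <stdio.h>

/* Hypothetical log file name (an assumption, not part of the original code). */
#define TRACE_FILE "excelcuda_trace.log"

/* Append one line to the trace file and close it immediately, so the text
   reaches disk even if the host process dies on the next statement.
   Returns 0 on success, -1 if the file could not be opened. */
static int trace_step(const char *msg, int value)
{
    FILE *fp = fopen(TRACE_FILE, "a");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s %d\n", msg, value);
    fclose(fp);
    return 0;
}

/* CPU-only stand-in with the same C signature as the CUDA kernel_wrapper:
   computes a[i] = a[i] + b[i] for i in [0, *Np) without touching the GPU. */
void kernel_wrapper(float *a, float *b, int *Np)
{
    int i;
    int N;

    trace_step("kernel_wrapper entered, null pointer seen:",
               a == NULL || b == NULL || Np == NULL);
    if (a == NULL || b == NULL || Np == NULL)
        return;

    N = *Np;
    trace_step("N =", N);

    for (i = 0; i < N; ++i)
        a[i] = a[i] + b[i];

    trace_step("kernel_wrapper finished, N =", N);
}
```

If the trace file ends after the "entered" line, or is never created, that suggests the crash happens at or before the call into the wrapper; a complete trace followed by a crash points at the return path to Excel.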

The above pieces of code were compiled using the following commands:

nvcc -c -m32 -O3 cudakernel.cu
ifort -dll -libs:dll -iface:stdcall excelcuda.f90 cudakernel.obj cuda.lib cudart.lib

The resulting DLL is called from Excel VBA using the following statements:

Declare Sub myarrtest Lib "excelcuda.dll" (ByRef x As Single, ByRef y As Single, ByRef n As Long)
...
...
Call myarrtest(vbarr(1), fortarr(1), n1)
...
...
