d:\sw\devrel\SDK10\Compute\C\bin\win64\Release\FDTD3d.exe Starting... Set-up, based upon target device GMEM size... getTargetDeviceGlobalMemSize cudaGetDeviceCount cudaGetDeviceProperties calloc host_output malloc input malloc coeff generateRandomData FDTD on 376 x 376 x 376 volume with symmetric filter radius 4 for 5 timesteps... fdtdReference... calloc intermediate Host FDTD loop t = 0 t = 1 t = 2 t = 3 t = 4 fdtdReference complete calloc device_output fdtdGPU... cudaGetDeviceCount cudaSetDevice (device 0) cudaMalloc bufferOut cudaMalloc bufferIn cudaFuncGetAttributes set block size to 32x16 set grid size to 12x24 cudaMemcpy (HostToDevice) bufferIn cudaMemcpy (HostToDevice) bufferOut cudaMemcpyToSymbol (HostToDevice) stencil GPU FDTD loop t = 0 launch kernel t = 1 launch kernel t = 2 launch kernel t = 3 launch kernel t = 4 launch kernel cutilDeviceSynchronize cudaMemcpy (DeviceToHost) cutilDeviceReset fdtdGPU complete CompareData (tolerance 0.000100)...