cudnn.torch

Torch7 FFI bindings for NVIDIA CuDNN (R3) kernels!

Modules are API-compatible with their nn equivalents and are fully unit-tested against the nn implementations.

Installation

Install CuDNN (version R3)
Have CUDA 7.0 or later
Have libcudnn.so in your library path (install it from https://developer.nvidia.com/cuDNN)

Modules

-- All inputs have to be 3D or 4D (batch mode), except ReLU, Tanh and Sigmoid
cudnn.SpatialConvolution(nInputPlane, nOutputPlane, kW, kH, [dW = 1], [dH = 1], [padW = 0], [padH = 0], [groups = 1])
cudnn.SpatialMaxPooling(kW, kH, dW, dH, padW, padH)
cudnn.SpatialAveragePooling(kW, kH, dW, dH, padW, padH)

-- The pointwise functions take an additional optional argument. If inplace = true, they operate in place without using any extra memory for themselves.
cudnn.ReLU(inplace[=false])
cudnn.Tanh(inplace[=false])
cudnn.Sigmoid(inplace[=false])

-- SoftMax can be run in fast mode or accurate mode. Default is accurate mode.
cudnn.SoftMax(fastMode [= false])          -- SoftMax across each image (just like nn.SoftMax)
cudnn.LogSoftMax()                         -- LogSoftMax across each image (just like nn.LogSoftMax)
cudnn.SpatialSoftMax(fastMode [= false])   -- SoftMax across feature-maps (per spatial location)
cudnn.SpatialLogSoftMax()                  -- LogSoftMax across feature-maps (per spatial location)

cudnn.SpatialCrossEntropyCriterion()       -- A spatial version of LogSoftMax + ClassNLLCriterion in one shot

-- Volumetric inputs (4D or 5D batched mode)
cudnn.VolumetricConvolution(nInputPlane, nOutputPlane, kT, kW, kH, dT, dW, dH, padT, padW, padH)
cudnn.VolumetricMaxPooling(kT, kW, kH, dT, dW, dH, padT, padW, padH)
cudnn.VolumetricAveragePooling(kT, kW, kH, dT, dW, dH, padT, padW, padH)
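
As an illustration of how the constructors listed above fit together, here is a minimal sketch of a small spatial network. It assumes a working Torch7 install with the cunn package and a CUDA-capable GPU; the layer sizes and the input shape are arbitrary values chosen for the example, not recommendations from this package.

```lua
require 'cunn'   -- assumption: cunn is installed; it loads nn and cutorch
require 'cudnn'

-- A small stack built only from modules listed in this README.
local net = nn.Sequential()
-- 3 input planes -> 16 output planes, 5x5 kernel, stride 1, padding 2
net:add(cudnn.SpatialConvolution(3, 16, 5, 5, 1, 1, 2, 2))
-- in-place ReLU: overwrites its input instead of allocating an output
net:add(cudnn.ReLU(true))
-- 2x2 max pooling with stride 2 and no padding
net:add(cudnn.SpatialMaxPooling(2, 2, 2, 2, 0, 0))
net:cuda()

-- 4D batch-mode input: batch x planes x height x width (sizes are illustrative)
local input = torch.CudaTensor(8, 3, 32, 32):uniform()
local output = net:forward(input)
print(output:size())
```

Because the modules follow the nn interface, they can be mixed freely with regular nn containers such as nn.Sequential, as done here.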

Modes

There are three globally available modes. The first two are useful for tuning performance, and the third helps with debugging:

require 'cudnn'
cudnn.benchmark = true -- uses the inbuilt cudnn auto-tuner to find the fastest convolution algorithms.
                       -- If this is set to false, uses some in-built heuristics that might not always be fastest.

By default, cudnn.benchmark is set to false.

cudnn.fastest = true -- this is like the :fastest() mode for the Convolution modules,
                     -- simply picks the fastest convolution algorithm, rather than tuning for workspace size

By default, cudnn.fastest is set to false.

cudnn.verbose = true -- this prints out some more verbose information useful for debugging

By default, cudnn.verbose is set to false.
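
The flags above can be combined in a short script. The following is a rough sketch, assuming a CUDA-capable GPU; the convolution sizes and the input shape are made up for illustration, and the per-module :fastest() call is used only as the README's description of cudnn.fastest suggests it behaves.

```lua
require 'cudnn'

-- Global switches, set before the first forward pass.
cudnn.benchmark = true   -- let the cudnn auto-tuner pick convolution algorithms
cudnn.verbose = true     -- print extra information useful for debugging

-- Illustrative layer: 3 -> 16 planes, 3x3 kernel, default stride 1 and padding 0
local conv = cudnn.SpatialConvolution(3, 16, 3, 3):cuda()

-- Per-module alternative to the global cudnn.fastest flag
conv:fastest()

-- Illustrative batch of 4 images, 3 planes, 24x24 pixels
local input = torch.CudaTensor(4, 3, 24, 24):uniform()
local output = conv:forward(input)
print(output:size())
```

The global flags affect every cudnn convolution in the process, whereas :fastest() only changes the module it is called on.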

Older versions

For CuDNN R1, check out the R1 branch.
For CuDNN R2, check out the R2 branch.
