Tensors are multi-dimensional arrays, and the fundamental data structure of deep learning frameworks such as TensorFlow and PyTorch.
A scalar has zero dimensions, a vector has one dimension, a matrix has two dimensions, and tensors have three or more. In practice, we often refer to scalars, vectors and matrices as tensors as well, for convenience.
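As a quick illustration of the dimension counts above, here is a minimal sketch that builds one tensor of each kind and inspects its ndim and shape attributes. The variable names are chosen for this example only:

```python
import torch

# One tensor per "kind"; the names are illustrative only
scalar = torch.tensor(3.14)               # zero dimensions
vector = torch.tensor([1, 2, 3])          # one dimension
matrix = torch.tensor([[1, 2], [3, 4]])   # two dimensions
cube = torch.zeros(2, 3, 4)               # three dimensions

for name, t in [('scalar', scalar), ('vector', vector),
                ('matrix', matrix), ('cube', cube)]:
    print(f'{name}: ndim={t.ndim}, shape={tuple(t.shape)}')

# scalar: ndim=0, shape=()
# vector: ndim=1, shape=(3,)
# matrix: ndim=2, shape=(2, 2)
# cube: ndim=3, shape=(2, 3, 4)
```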
Note: A tensor can also be any n-dimensional array, just like a NumPy array can. Many frameworks support working with NumPy arrays directly, so the integration is both natural and efficient.
However, a torch.Tensor has more built-in capabilities than NumPy arrays do, and these capabilities are geared towards deep learning applications (such as GPU acceleration), so it makes sense to prefer torch.Tensor instances over regular NumPy arrays when working with PyTorch. Additionally, torch.Tensor has a very NumPy-like API, making it intuitive for most people with prior NumPy experience!
In this guide, learn how to convert between a Numpy Array and PyTorch Tensors.
Convert Numpy Array to PyTorch Tensor
To convert a NumPy array to a PyTorch tensor, we have three approaches we could take: using the from_numpy() function, supplying the NumPy array to the torch.Tensor() constructor, or using the tensor() function:
import torch
import numpy as np
np_array = np.array([5, 7, 1, 2, 4, 4])
# Convert Numpy array to torch.Tensor
tensor_a = torch.from_numpy(np_array)
tensor_b = torch.Tensor(np_array)
tensor_c = torch.tensor(np_array)
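So, what's the difference? The from_numpy() and tensor() functions are dtype-aware! Since we've created a NumPy array of integers, the dtype of the underlying elements will be NumPy's default integer type. This is platform-dependent: it's int32 in the environment used here, and int64 on most 64-bit Linux and macOS systems: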
print(np_array.dtype)
# dtype('int32')
If we were to print out our three tensors:
print(f'tensor_a: {tensor_a}\ntensor_b: {tensor_b}\ntensor_c: {tensor_c}')
tensor_a and tensor_c retain the data type used within the np_array, cast into PyTorch's variant (torch.int32), while tensor_b automatically converts the values to floats:
tensor_a: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)
tensor_b: tensor([5., 7., 1., 2., 4., 4.])
tensor_c: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)
This can also be observed by checking their dtype fields:
print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32
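Besides the dtype, these approaches also differ in how they treat the array's memory: from_numpy() returns a tensor that shares memory with the NumPy array, while tensor() copies the data. The sketch below illustrates this; the names source, shared and copied are made up for this example:

```python
import numpy as np
import torch

source = np.array([5, 7, 1, 2, 4, 4])

shared = torch.from_numpy(source)  # wraps the array's memory, no copy
copied = torch.tensor(source)      # copies the data into new memory

# Modify the NumPy array in place
source[0] = 100

print(shared[0].item())  # 100 - the change is visible through the tensor
print(copied[0].item())  # 5   - the copy is unaffected
```

If you want a tensor that's independent of the original array, tensor() is the safer choice. If you want to avoid a copy, from_numpy() is the better fit, as long as you keep the shared memory in mind.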
Numpy Array to PyTorch Tensor with dtype
These approaches also differ in whether you can explicitly set the desired dtype when creating the tensor. from_numpy() and Tensor() don't accept a dtype argument, while tensor() does:
# Retains Numpy dtype
tensor_a = torch.from_numpy(np_array)
# Creates tensor with float32 dtype
tensor_b = torch.Tensor(np_array)
# Retains Numpy dtype OR creates tensor with specified dtype
tensor_c = torch.tensor(np_array, dtype=torch.int32)
print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32
Naturally, you can cast any of them very easily, using the exact same syntax, which lets you set the dtype after creation as well. The lack of a dtype argument isn't a real limitation, and accepting one is more of a convenience:
tensor_a = tensor_a.float()
tensor_b = tensor_b.float()
tensor_c = tensor_c.float()
print(tensor_a.dtype) # torch.float32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.float32
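The float() method is a shorthand for one specific dtype. As a small sketch of the alternatives, the to() method accepts any dtype, and similar shorthands such as long() exist for other types. Note that a cast returns a new tensor rather than changing the original; the variable names here are illustrative:

```python
import numpy as np
import torch

np_array = np.array([5, 7, 1, 2, 4, 4])
tensor = torch.from_numpy(np_array)

as_float = tensor.float()              # shorthand for torch.float32
as_double = tensor.to(torch.float64)   # general form, accepts any dtype
as_long = tensor.long()                # shorthand for torch.int64

print(as_float.dtype)   # torch.float32
print(as_double.dtype)  # torch.float64
print(as_long.dtype)    # torch.int64

# The original tensor keeps its integer dtype
print(tensor.dtype == as_float.dtype)  # False
```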
Convert PyTorch Tensor to Numpy Array
Converting a PyTorch tensor to a NumPy array is straightforward. For a CPU tensor, numpy() returns an array that shares the tensor's underlying memory, so all we have to do is "expose" that data as a NumPy array.
Since PyTorch can optimize the calculations performed on data based on your hardware, there are a couple of caveats though:
tensor = torch.tensor([1, 2, 3, 4, 5])
np_a = tensor.numpy()
np_b = tensor.detach().numpy()
np_c = tensor.detach().cpu().numpy()
So, why use detach() and cpu() before exposing the underlying data structure with numpy(), and when should you detach and transfer to a CPU?
CPU PyTorch Tensor -> CPU Numpy Array
If your tensor is on the CPU, where the new Numpy array will also be - it's fine to just expose the data structure:
np_a = tensor.numpy()
# array([1, 2, 3, 4, 5], dtype=int64)
This works very well, and you've got yourself a clean Numpy array.
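One thing to keep in mind is that, for a CPU tensor, the array returned by numpy() shares memory with the tensor, so in-place changes to one show up in the other. The following sketch demonstrates this, and uses NumPy's copy() when an independent array is needed. The names np_view and np_copy are chosen for this example:

```python
import torch

tensor = torch.tensor([1, 2, 3, 4, 5])

np_view = tensor.numpy()          # shares memory with the tensor
np_copy = tensor.numpy().copy()   # independent copy of the data

# Modify the tensor in place
tensor[0] = 100

print(np_view[0])  # 100 - the array sees the change
print(np_copy[0])  # 1   - the copy does not
```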
CPU PyTorch Tensor with Gradients -> CPU Numpy Array
However, if gradients need to be calculated for your tensor as well (i.e. the requires_grad argument is set to True), this approach won't work anymore. You'll have to detach the tensor from the computation graph first. detach() returns a tensor that shares the same data but no longer tracks gradients:
tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True)
np_a = tensor.numpy()
# RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
np_b = tensor.detach().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)
GPU PyTorch Tensor -> CPU Numpy Array
Finally - if you've created your tensor on the GPU, it's worth remembering that regular NumPy arrays don't support GPU acceleration. They reside on the CPU! You'll have to transfer the tensor to the CPU (and detach it, if it requires gradients) before exposing the data structure.
Note: The transfer can be done via either the to('cpu') or cpu() functions - they're functionally equivalent.
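As a minimal sketch of that equivalence, both calls return a tensor on the CPU. For a tensor that already lives on the CPU, no copy is made and the original object is returned. The GPU branch only runs if CUDA is available, and the variable names are illustrative:

```python
import torch

cpu_tensor = torch.tensor([1, 2, 3, 4, 5])

# Already on the CPU: both calls hand back the very same object
print(cpu_tensor.cpu() is cpu_tensor)       # True
print(cpu_tensor.to('cpu') is cpu_tensor)   # True

# On a machine with a GPU, both calls copy the data back to host memory
if torch.cuda.is_available():
    gpu_tensor = cpu_tensor.cuda()
    print(gpu_tensor.cpu().device)      # cpu
    print(gpu_tensor.to('cpu').device)  # cpu
```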
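This has to be done explicitly, because a NumPy array can only wrap CPU memory. Converting a CUDA tensor always involves copying data from the GPU to the host, while converting a CPU tensor doesn't copy anything. If the transfer were done automatically, the same numpy() call would behave differently under the hood depending on the device, which could lead to unexpected bugs down the line.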
PyTorch is fairly explicit, so this sort of automatic conversion was purposefully avoided:
# Create tensor on the GPU
tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True).cuda()
np_b = tensor.detach().numpy()
# TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
np_c = tensor.detach().cpu().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)
Note: It's advisable to call detach() before cpu(), so that the tensor is cut from the computation graph before it's transferred to the CPU. If cpu() is called first on a tensor that requires gradients, the copy itself is recorded as an operation in the graph, which is redundant since the result is detached right afterwards. It's better to "cut the dead weight" as soon as possible.
Generally speaking, this approach is the safest, since it won't fail no matter which sort of tensor you're working with. If you've got a CPU tensor and you try sending it to the CPU, nothing happens. If you've got a tensor without gradients and try detaching it, nothing changes. Skipping a step that's actually needed, on the other hand, throws an exception.
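If you convert tensors often, the safe chain can be wrapped in a small helper. The to_numpy() function below is a hypothetical convenience for this guide, not part of PyTorch, and the GPU case is only exercised if CUDA is available:

```python
import numpy as np
import torch

def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: detach, move to the CPU, then expose as a NumPy array."""
    return tensor.detach().cpu().numpy()

plain = torch.tensor([1, 2, 3])
with_grad = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

print(to_numpy(plain))      # [1 2 3]
print(to_numpy(with_grad))  # [1. 2. 3.]

# Only runs on a machine with a CUDA-capable GPU
if torch.cuda.is_available():
    on_gpu = with_grad.cuda()
    print(to_numpy(on_gpu))
```

The same call works for all three kinds of tensors covered in this guide, at the cost of two calls that do nothing for plain CPU tensors.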
Conclusion
In this guide, we've taken a look at what PyTorch tensors are, before diving into how to convert a NumPy array into a PyTorch tensor. Finally, we've explored how PyTorch tensors can be exposed as NumPy arrays, and in which cases you'd have to detach them or transfer them to the CPU first.