How Dicomparser Can Read Dcm Files in a Folder Python Code
I'll be showing how to use the pydicom package and/or VTK to read a series of DICOM images into a NumPy array. This will involve reading metadata from the DICOM files and the pixel-information itself.
Introduction: The DICOM standard
Anyone in the medical image processing or diagnostic imaging field will undoubtedly have dealt with the infamous Digital Imaging and Communications in Medicine (DICOM) standard, the de facto solution for storing and exchanging medical image data.
Applications such as RadiAnt or MicroDicom for Windows and OsiriX for Mac do a great job of dealing with DICOM files. However, there are as many flavors of DICOM as there are of ice cream. Thus, when it comes to programmatically reading and processing DICOM files, things get a little hairier depending on whether the files store the pixel data in compressed form or not.
In this post I will show how to read an uncompressed DICOM file series through either the pydicom package or VTK. Take a look at the Resources section for tips on how to tackle compressed DICOM files.
DICOM Datasets
There's a wealth of freely available DICOM datasets online, but here are a few that should get you started:
- Osirix Datasets: This is my personal favorite, as it provides a wide range of human datasets acquired through a variety of imaging modalities.
- Visible Human Datasets: Parts of the Visible Human project are somehow freely distributed here, which is weird 'cause getting that data is neither free nor hassle-free.
- The Zubal Phantom: This website offers multiple datasets of two human males in CT and MRI, which are freely distributed.
Despite the fact that it's easy to get DICOM datasets, most databases forbid their redistribution by third parties. Therefore, for the purposes of these posts I decided to use a dataset from an MR examination of my own fat head. You can find a .zip file with .dcm files on the sagittal plane here. In order to follow this post, extract the contents of this file alongside the IPython Notebooks, the contents of which I'll be presenting in the pydicom Usage section and the VTK Usage section.
The pydicom package
In this, the first of two posts, I will show how to use the pydicom package, which consists of pure-Python code, is hosted on PyPI, and can be easily installed through pip as such:
    pip install pydicom
As is often the case with many Python packages, while this package is called pydicom it simply goes by dicom within Python and needs to be imported with import dicom. That holds for the pre-1.0 releases this post was written against; from pydicom 1.0 onward the import name is pydicom, and dicom.read_file corresponds to pydicom.dcmread.
Usage
In this example I'm gonna use the MR dataset of my own head, discussed in the DICOM Datasets section, and the pydicom package, to load the entire series of DICOM data into a 3D NumPy array and visualize two slices through matplotlib. You can find the entire IPython Notebook here.
Obviously we'll start with importing the packages we'll need:
    import dicom
    import os
    import numpy
    from matplotlib import pyplot, cm
The only point of interest here is, as I mentioned in the pydicom package section, that the pydicom package is imported as dicom, so be careful with that. Next we use os.walk to traverse the MyHead directory and collect all .dcm files into a list named lstFilesDCM:
    PathDicom = "./MyHead/"
    lstFilesDCM = []  # create an empty list
    for dirName, subdirList, fileList in os.walk(PathDicom):
        for filename in fileList:
            if filename.lower().endswith(".dcm"):  # check whether the file's DICOM
                lstFilesDCM.append(os.path.join(dirName, filename))
If you check the MyHead folder you'll see that the .dcm files are named MR000000.dcm, MR000001.dcm, etc. Therefore, the walk function happened to return them in order, since the OS listed them lexicographically. os.walk itself guarantees no particular order, though, so sorting the list explicitly is the safer habit. Moreover, in many cases DICOM files don't have all those leading zeros, with names like MR1, MR2, etc., which would result in lstFilesDCM having the filenames ordered in a fashion such as MR1, MR10, MR100, etc. Since the typical sorting functions in Python, such as sorted and the sort method of list objects (docs here), are lexicographical as well (unless dealing with pure numbers), I strongly suggest using the very useful natsort package, which can be found on PyPI here (and can be installed with a simple pip install natsort).
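If pulling in natsort is not an option, the same idea can be sketched with the standard library alone. The helper names below (natural_key, collect_dcm_files) are made up for illustration and belong to neither pydicom nor natsort, and the sketch assumes a single series whose filenames differ only in their embedded numbers.

```python
import os
import re


def natural_key(path):
    """Split a filename into text and integer chunks so 'MR2' sorts before 'MR10'."""
    name = os.path.basename(path).lower()
    # re.split with a capturing group alternates text, digits, text, ...
    # so every odd position holds a run of digits.
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if i % 2 else chunk for i, chunk in enumerate(chunks)]


def collect_dcm_files(root):
    """Walk 'root' and return every .dcm path, sorted in natural order."""
    found = []
    for dir_name, _subdirs, file_list in os.walk(root):
        for filename in file_list:
            if filename.lower().endswith(".dcm"):
                found.append(os.path.join(dir_name, filename))
    return sorted(found, key=natural_key)
```

Calling collect_dcm_files("./MyHead/") would then stand in for the loop above, with the ordering made explicit rather than left to the OS.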
Now, let's get into the pydicom part of the code. A notable aspect of this package is that upon reading a DICOM file, it creates a dicom.dataset.FileDataset object where the different metadata are assigned to object attributes with the same name. We'll see this below:
    # Get ref file
    RefDs = dicom.read_file(lstFilesDCM[0])
    # Load dimensions based on the number of rows, columns, and slices (along the Z axis)
    ConstPixelDims = (int(RefDs.Rows), int(RefDs.Columns), len(lstFilesDCM))
    # Load spacing values (in mm)
    ConstPixelSpacing = (float(RefDs.PixelSpacing[0]), float(RefDs.PixelSpacing[1]), float(RefDs.SliceThickness))
In the first line we load the first DICOM file, whose filename is first in the lstFilesDCM list, and which we're gonna use as a reference named RefDs to extract metadata. We then calculate the total dimensions of the 3D NumPy array, which are equal to (Number of pixel rows in a slice) x (Number of pixel columns in a slice) x (Number of slices) along the x, y, and z cartesian axes. Lastly, we use the PixelSpacing and SliceThickness attributes to calculate the spacing between pixels in the three axes. We store the array dimensions in ConstPixelDims and the spacing in ConstPixelSpacing.
If you were to open one of the DICOM files with an application such as the ones mentioned in the Intro section and checked the metadata, you'd see that Rows, Columns, PixelSpacing, and SliceThickness are all metadata entries. pydicom simply creates attributes with the same names and assigns appropriate values to those, making them easily accessible.
The next chunk of code is:
    x = numpy.arange(0.0, (ConstPixelDims[0]+1)*ConstPixelSpacing[0], ConstPixelSpacing[0])
    y = numpy.arange(0.0, (ConstPixelDims[1]+1)*ConstPixelSpacing[1], ConstPixelSpacing[1])
    z = numpy.arange(0.0, (ConstPixelDims[2]+1)*ConstPixelSpacing[2], ConstPixelSpacing[2])
where we simply use numpy.arange, ConstPixelDims, and ConstPixelSpacing to calculate axes for this array. Next comes the last pydicom part:
    # The array is sized based on 'ConstPixelDims'
    ArrayDicom = numpy.zeros(ConstPixelDims, dtype=RefDs.pixel_array.dtype)
    # loop through all the DICOM files
    for filenameDCM in lstFilesDCM:
        # read the file
        ds = dicom.read_file(filenameDCM)
        # store the raw image data
        ArrayDicom[:, :, lstFilesDCM.index(filenameDCM)] = ds.pixel_array
As you can see, what we do here is first create a NumPy array named ArrayDicom with the dimensions specified in ConstPixelDims calculated earlier. The dtype of this array is the same as the dtype of the pixel_array of the reference dataset RefDs, which we originally used to extract metadata. The point of interest here is that the pixel_array object is a pure NumPy array containing the pixel data for the particular DICOM slice/image. Therefore, what we do next is loop through the collected DICOM filenames and use the dicom.read_file function to read each file into a dicom.dataset.FileDataset object. We then use the pixel_array attribute of that object and toss it into ArrayDicom, stacking the slices along the z axis.
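The axis and stacking steps can be tried without any DICOM files at hand. The sketch below is a stand-in, not the notebook's code: the slices are synthetic NumPy arrays, the dimensions and spacings are invented, and enumerate supplies the slice index instead of a list search. It also builds the axes as numpy.arange(n + 1) * spacing, an alternative that sidesteps the length ambiguity numpy.arange can show with non-integer steps.

```python
import numpy

# Invented stand-ins for the values read from the reference dataset
dims = (4, 3, 5)             # (rows, columns, number of slices)
spacing = (0.5, 0.5, 2.0)    # spacing in mm along each axis

# Synthetic 'slices': slice k is filled with the value k
slices = [numpy.full(dims[:2], k, dtype=numpy.uint16) for k in range(dims[2])]

# Stack the 2D slices along the third axis
volume = numpy.zeros(dims, dtype=slices[0].dtype)
for index, pixel_array in enumerate(slices):
    volume[:, :, index] = pixel_array

# One more coordinate than pixels per axis, i.e. the pixel edges
x, y, z = (numpy.arange(n + 1) * step for n, step in zip(dims, spacing))
```

The extra coordinate per axis matches what pcolormesh expects when it is handed cell edges rather than cell centers.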
And that's it! Using the pyplot module in matplotlib we can create a nice lil' plot as such:
    pyplot.figure(dpi=300)
    pyplot.axes().set_aspect('equal', 'datalim')
    pyplot.set_cmap(pyplot.gray())
    pyplot.pcolormesh(x, y, numpy.flipud(ArrayDicom[:, :, 80]))
which renders slice 80 of the series in grayscale.
Reading DICOM through VTK
Now while skimming the previous section you might have thought 'pfff that's way too easy, why did we bother reading your rants?'. Well, in the interest of keeping you interested, I decided, against my better judgement, to provide the VTK approach to the above process.
Usage
You can find a separate notebook here, while I'll be using the same dataset. Make sure to check that notebook 'cause I'll only be detailing the VTK parts of the code here. You will find that the VTK solution is quite a bit more succinct. Now let's start with reading in the series of .dcm files:
    PathDicom = "./MyHead/"
    reader = vtk.vtkDICOMImageReader()
    reader.SetDirectoryName(PathDicom)
    reader.Update()
As you can see, unlike the approach in the previous section, here we saved ourselves the two loops, namely populating the filename list and reading in the data slice by slice. We first create a vtkDICOMImageReader object (docs here), and pass the path to the directory where all the .dcm files are through the SetDirectoryName method. After that, it's just a matter of calling the Update method, which does all the reading. If you were dealing with huge datasets, you'd be surprised how much faster than pydicom VTK does that.
Don't let the above approach fool you entirely. It worked only 'cause the OS sorted the files correctly by itself. As I mentioned in the previous section, if the files weren't properly named, lexicographical sorting would have given you a messed-up array. In that case you would either need to loop and pass each file to a separate reader through the SetFileName method, or you'd have to create a vtkStringArray, push the sorted filenames, and use the vtkDICOMImageReader.SetFileNames method. Keep your eyes open! VTK is not forgiving 🙂
Next we need access to the metadata in order to calculate those ConstPixelDims and ConstPixelSpacing variables:
    # Load dimensions using `GetDataExtent`
    _extent = reader.GetDataExtent()
    ConstPixelDims = [_extent[1]-_extent[0]+1, _extent[3]-_extent[2]+1, _extent[5]-_extent[4]+1]
    # Load spacing values
    ConstPixelSpacing = reader.GetPixelSpacing()
As you can see, the vtkDICOMImageReader class comes with a few useful methods that provide the metadata straight up. However, only a few of those values are available in a straightforward manner. Thankfully, by using the GetDataExtent method of the reader we get a six-value tuple with the starting and stopping indices of the resulting array on all three axes. Obviously, that's all we need to calculate the size of the array. Getting the pixel spacing is even easier than with pydicom and simply accomplished with the GetPixelSpacing method.
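To make the extent arithmetic concrete: each axis contributes a (first, last) pair of inclusive indices, so its size is last - first + 1. A minimal sketch with a made-up extent tuple (dims_from_extent is a hypothetical helper, not a VTK call):

```python
def dims_from_extent(extent):
    """Turn (x0, x1, y0, y1, z0, z1) inclusive index bounds into per-axis sizes."""
    return [extent[i + 1] - extent[i] + 1 for i in range(0, 6, 2)]


# Invented example: a 256 x 256 image with 94 slices
example_extent = (0, 255, 0, 255, 0, 93)
example_dims = dims_from_extent(example_extent)
```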
Now, onto the fun part :). You might have read my previous post on how to convert arrays between NumPy and VTK. You might have thought that we can use that functionality and get a nice NumPy array with a one-liner. Well, I hate to disappoint you but it's not that straightforward (remember, VTK).
If you dig a little into the vtkDICOMImageReader docs, you will see that it inherits vtkImageReader2, which in turn inherits vtkImageAlgorithm. The latter sports a GetOutput method returning a vtkImageData object pointer. However, the numpy_support.vtk_to_numpy function only works on vtkArray objects, so we need to dig into the vtkImageData object till we get that type of array. Here's how we do that:
    # Get the 'vtkImageData' object from the reader
    imageData = reader.GetOutput()
    # Get the 'vtkPointData' object from the 'vtkImageData' object
    pointData = imageData.GetPointData()
    # Ensure that only one array exists within the 'vtkPointData' object
    assert (pointData.GetNumberOfArrays()==1)
    # Get the `vtkArray` (or whatever derived type) which is needed for the `numpy_support.vtk_to_numpy` function
    arrayData = pointData.GetArray(0)
    # Convert the `vtkArray` to a NumPy array
    ArrayDicom = numpy_support.vtk_to_numpy(arrayData)
    # Reshape the NumPy array to 3D using 'ConstPixelDims' as a 'shape'
    ArrayDicom = ArrayDicom.reshape(ConstPixelDims, order='F')
As you can see, we initially use reader.GetOutput() to get a vtkImageData object pointer into imageData. We then use the GetPointData method of that object to 'extract' a vtkPointData object pointer into pointData. Now, these vtkPointData objects may hold multiple arrays, but we should only have one in there (being the entirety of the DICOM data). Since these 'internal' arrays are numerically indexed, the array we were looking for sits at index 0 and can be retrieved through arrayData = pointData.GetArray(0). We can now finally 'convert' that pesky array to NumPy through vtk_to_numpy and store it in ArrayDicom. As a last step, we reshape that array using ConstPixelDims et voila!
From that point, we use our lovely NumPy array and get the same plots we got with the previous approach.
Note that we reshape ArrayDicom with a 'Fortran' order. Don't ask me why, but when I tried to reshape in C order I got misaligned rubbish, so there. Trial-n-error.
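A likely explanation, offered as a sketch rather than something the notebook verifies: VTK lays image data out with the x index varying fastest, then y, then z, which is what NumPy calls Fortran (column-major) order for a shape of (nx, ny, nz). The toy example below fakes such a flat buffer with plain NumPy (no VTK involved; the sizes are arbitrary) and checks which reshape puts each value back where it belongs.

```python
import numpy

nx, ny, nz = 2, 3, 4

# Build a flat buffer the way an x-fastest layout would: z outermost, x innermost.
# Each value encodes its own (x, y, z) position as 100*x + 10*y + z.
flat = numpy.array([100 * x + 10 * y + z
                    for z in range(nz)
                    for y in range(ny)
                    for x in range(nx)])

as_fortran = flat.reshape((nx, ny, nz), order='F')  # first index varies fastest
as_c = flat.reshape((nx, ny, nz), order='C')        # last index varies fastest

# With order='F' every element sits at the index it encodes
fortran_ok = all(as_fortran[x, y, z] == 100 * x + 10 * y + z
                 for x in range(nx) for y in range(ny) for z in range(nz))
# With order='C' the same buffer ends up scrambled
c_ok = all(as_c[x, y, z] == 100 * x + 10 * y + z
           for x in range(nx) for y in range(ny) for z in range(nz))
```

An equivalent route is to reshape in C order with the shape reversed, (nz, ny, nx), and transpose the axes afterwards.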
Resources
Should you want to learn more about pydicom, do check the project's official website, its wiki, and its user guide.
'Whoa', you might say though, 'what about those JPEG-based compressed DICOM files you mentioned in the intro?' Well, unfortunately, neither pydicom nor the admittedly convenient vtkDICOMImageReader class can handle those. At least the pydicom package will warn you and raise a NotImplementedError upon reading such a DICOM file, while VTK will just return an array full of 0s and leave you wondering. At this point I can only think of two viable solutions to this:
- Go the hardcore route, install the GDCM library with Python bindings on your system, and use the mudicom package to handle it in Python. GDCM is a breeze to install on Windows and Linux systems as it provides pre-compiled binaries for those. You should also be able to install it on Mac using Homebrew and this recipe, but I haven't tested it yet.
- Go the cheater's route, or as I like to call it, the lazy engineer's route. Simply open that data in one of the applications mentioned in the Intro, such as OsiriX, and save it as uncompressed DICOM (yes, you can do that).
Anyway, enough for today, hope you've learned enough to start butchering medical image data on your own, as there are a million awesome things you can do with those. We'll see a few cool things in subsequent posts. Next up, I'm going to regale you with bone-chilling tales of marching-cubes and surface extraction :). Thank you for reading!
Source: https://pyscience.wordpress.com/2014/09/08/dicom-in-python-importing-medical-image-data-into-numpy-with-pydicom-and-vtk/