This post was last edited by Zhao_kar on 2024-5-7 22:10
"Deep Learning and Medical Image Processing" reading notes - data preprocessing
This part of the book covers data preprocessing and data augmentation. This post shares only the data preprocessing part; data augmentation is not covered here.
The book discusses the following topics under data preprocessing:
- Interpolation
- Resampling
- Signal intensity histogram
- Data Normalization
- Connected domain analysis and morphological methods
1. Interpolation
First, consider where interpolation is actually needed. When an image is rotated or enlarged, for example when a 2*2 image is enlarged to a 4*4 image with 16 pixels, the operation creates pixels whose values are unknown. An interpolation algorithm estimates these unknown pixels from the original pixel values.
Two common interpolation methods:
1. Nearest neighbor interpolation method
This one is relatively simple: each pixel in the transformed image takes the value of the nearest input pixel. It is therefore very fast to compute, but the drawback is just as obvious: the enlarged image has jagged edges.
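To make the 2*2 to 4*4 example concrete, here is a minimal sketch of nearest neighbor enlargement written with NumPy. The function name `resize_nearest` and the pixel-center mapping are my own choices for illustration, not something taken from the book.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Resize a 2D array by nearest neighbor interpolation."""
    h, w = image.shape
    # Map the center of each output pixel back into input coordinates
    # and keep the input pixel that the center falls into.
    rows = np.floor((np.arange(new_h) + 0.5) * h / new_h).astype(int)
    cols = np.floor((np.arange(new_w) + 0.5) * w / new_w).astype(int)
    rows = np.clip(rows, 0, h - 1)
    cols = np.clip(cols, 0, w - 1)
    return image[np.ix_(rows, cols)]

small = np.array([[10, 20],
                  [30, 40]])
large = resize_nearest(small, 4, 4)
print(large)
# [[10 10 20 20]
#  [10 10 20 20]
#  [30 30 40 40]
#  [30 30 40 40]]
```

Each input pixel is simply repeated as a 2*2 block, which is where the jagged edges come from.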
2. Bilinear interpolation
The book gives a formula here, which can be derived with the help of Figure a. The derivation is not difficult; in short, it is enough to know the result, and that linear interpolation along one direction can be extended to two directions, which gives bilinear interpolation. The result is correspondingly better than nearest neighbor interpolation, but the computation is slower.
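Since I did not reproduce the book's formula, here is a small NumPy sketch of bilinear interpolation under my own assumptions: the function name `resize_bilinear` is hypothetical, coordinates follow a pixel-center convention, and positions outside the image are clamped to the border. It interpolates horizontally between the two nearest columns first, then vertically between the two nearest rows.

```python
import numpy as np

def resize_bilinear(image, new_h, new_w):
    """Resize a 2D array by bilinear interpolation."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    # Output pixel centers expressed in input coordinates, clamped to the image.
    y = np.clip((np.arange(new_h) + 0.5) * h / new_h - 0.5, 0, h - 1)
    x = np.clip((np.arange(new_w) + 0.5) * w / new_w - 0.5, 0, w - 1)
    # Indices of the neighboring rows and columns around each position.
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    # Fractional distances act as the interpolation weights.
    wy = (y - y0)[:, None]
    wx = (x - x0)[None, :]
    # Linear interpolation along x on the upper and lower rows ...
    top = image[np.ix_(y0, x0)] * (1 - wx) + image[np.ix_(y0, x1)] * wx
    bottom = image[np.ix_(y1, x0)] * (1 - wx) + image[np.ix_(y1, x1)] * wx
    # ... followed by linear interpolation along y.
    return top * (1 - wy) + bottom * wy

small = np.array([[10, 20],
                  [30, 40]])
smooth = resize_bilinear(small, 4, 4)
print(smooth)
```

Unlike nearest neighbor interpolation, the output contains intermediate values between the original pixels, so the transitions are smoother.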
See the figure below for a detailed comparison of the results (figure a shows obvious jagged edges).
2. Resampling
This involves a concept called voxel spacing. In medicine, the actual physical size of a body part matters a great deal, but different equipment and scanning protocols produce images with different voxel spacing. The images therefore have to be resampled to a common voxel spacing, so that the number of voxels reflects the actual size of what was imaged.
Here is a formula: Actual size = number of voxels * voxel spacing
Therefore, to keep the actual size unchanged, if the voxel spacing of the image is increased, the number of voxels decreases. Conversely, if the voxel spacing is decreased, the number of voxels increases and the new voxels have no measured values, so they have to be filled in by interpolation.
In short, the voxel spacing and the number of voxels are changed together so that the true size is preserved, which is why resampling is combined with interpolation.
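The relationship "actual size = number of voxels * voxel spacing" can be sketched in a few lines of NumPy. The helper names `resampled_shape` and `resample_nearest`, the toy volume, and the spacing values in millimeters are all assumptions for illustration; nearest neighbor is used only to keep the sketch short, and a real pipeline would likely use a smoother interpolation.

```python
import numpy as np

def resampled_shape(shape, spacing, new_spacing):
    """Voxel counts that keep count * spacing (the physical size) unchanged."""
    shape = np.asarray(shape, dtype=float)
    spacing = np.asarray(spacing, dtype=float)
    new_spacing = np.asarray(new_spacing, dtype=float)
    return tuple(int(n) for n in np.round(shape * spacing / new_spacing))

def resample_nearest(volume, spacing, new_spacing):
    """Resample a volume to new_spacing with nearest neighbor interpolation."""
    new_shape = resampled_shape(volume.shape, spacing, new_spacing)
    index = []
    for n_old, n_new in zip(volume.shape, new_shape):
        # Nearest source voxel along this axis for every target voxel.
        idx = np.floor((np.arange(n_new) + 0.5) * n_old / n_new).astype(int)
        index.append(np.clip(idx, 0, n_old - 1))
    return volume[np.ix_(*index)]

volume = np.arange(4 * 4 * 2).reshape(4, 4, 2)  # toy volume, 4 x 4 x 2 voxels
spacing = (1.0, 1.0, 2.5)       # assumed original spacing in mm
new_spacing = (2.0, 2.0, 1.25)  # assumed target spacing in mm

resampled = resample_nearest(volume, spacing, new_spacing)
print(resampled.shape)  # (2, 2, 4)
```

The physical extent is 4 mm x 4 mm x 5 mm both before and after. Along the first two axes the spacing grows and the voxel count shrinks; along the last axis the spacing shrinks, so new voxels appear and have to be interpolated.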
3. Histogram
To be honest, the histogram is relatively simple. It shows the distribution of signal intensity, that is, how many pixels (or voxels) fall into each intensity range. Here is a picture from the book to help you understand.
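As a rough illustration, an intensity histogram can be computed with `np.histogram`. The image below is random synthetic data standing in for a real slice, and the choice of 16 bins over the range 0 to 256 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for an image slice with intensities in 0..255.
image = rng.integers(0, 256, size=(64, 64))

# Count how many pixels fall into each of 16 equal-width intensity bins.
counts, bin_edges = np.histogram(image, bins=16, range=(0, 256))

for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.0f} to {right:5.0f}: {count}")
```

The counts always add up to the total number of pixels, so the histogram is just another view of the same image data.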
4. Data Normalization
I think this will be useful to most people who work on algorithms. For example, in signal processing the collected waveform needs to be normalized so that it can be handled consistently by the subsequent hardware and software.
The underlying idea is to remove the units (dimensions) from the data so that values are on a comparable scale. The book describes two methods:
1. Interval Normalization
This sets the maximum value to 1 and the minimum value to 0, and maps every other value linearly into the range 0-1. It can be implemented by writing a small function yourself.
This method is sensitive to signal values but not to their distribution, and is suitable for images with a fixed intensity distribution, such as CT images, where the meaning of each intensity value is fixed.
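A minimal sketch of such a function is below. The name `min_max_normalize` is my own, and the optional `low` and `high` bounds are an assumption I added: because CT intensity values have a fixed meaning, one may prefer a fixed range over the per-image minimum and maximum.

```python
import numpy as np

def min_max_normalize(image, low=None, high=None):
    """Map intensities linearly to [0, 1].

    If low/high are not given, the image minimum/maximum are used.
    """
    image = np.asarray(image, dtype=float)
    low = image.min() if low is None else float(low)
    high = image.max() if high is None else float(high)
    if high <= low:
        # Constant image or invalid range: avoid dividing by zero.
        return np.zeros_like(image)
    clipped = np.clip(image, low, high)
    return (clipped - low) / (high - low)

# Toy values chosen for illustration only.
ct_like = np.array([[-1000.0, 0.0],
                    [40.0, 1000.0]])
per_image = min_max_normalize(ct_like)
windowed = min_max_normalize(ct_like, low=-100, high=300)
print(per_image)
print(windowed)
```

With fixed bounds, the same input value always maps to the same output value regardless of what else is in the image.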
2. Z-score normalization
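This rescales the data to zero mean and unit standard deviation by subtracting the mean and dividing by the standard deviation. Its characteristics are the opposite of the method above: it is sensitive to the distribution but not to the absolute signal values, and it is suitable for MR images.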
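Z-score normalization is also easy to write by hand. In this sketch the function name `z_score_normalize` and the small `eps` term (added to avoid dividing by zero on a constant image) are my own assumptions.

```python
import numpy as np

def z_score_normalize(image, eps=1e-8):
    """Subtract the mean and divide by the standard deviation."""
    image = np.asarray(image, dtype=float)
    return (image - image.mean()) / (image.std() + eps)

# Toy values chosen for illustration only.
mr_like = np.array([[100.0, 200.0],
                    [300.0, 400.0]])
normalized = z_score_normalize(mr_like)
# The same image with a different overall scale and offset.
rescaled = z_score_normalize(2.0 * mr_like + 50.0)

print(normalized)
print(np.allclose(normalized, rescaled))
```

The last line illustrates why this method does not depend on the absolute signal values: scaling or shifting all intensities leaves the normalized result essentially unchanged.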
5. Connected Domain Analysis and Morphological Methods
There is no detailed description in the book, only the concepts are introduced, so I will just give the concepts here.
1. A connected domain (also called a connected region) generally refers to an image region made up of adjacent pixels with the same pixel value. Connected domain analysis means finding and labeling each connected region in an image. It is very commonly used in many areas of image analysis and processing, and its input is generally a binary image.
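Since the book only gives the concept, here is a hand-written sketch of connected domain labeling on a binary image, using a breadth-first flood fill with 4-connectivity (up, down, left, right). The function name `label_components` and the choice of 4-connectivity are my own assumptions; in practice a library routine such as `scipy.ndimage.label` would normally be used instead.

```python
from collections import deque

import numpy as np

def label_components(binary):
    """Label 4-connected foreground regions; returns (labels, count)."""
    binary = np.asarray(binary, dtype=bool)
    labels = np.zeros(binary.shape, dtype=int)
    h, w = binary.shape
    current = 0
    for r in range(h):
        for c in range(w):
            if not binary[r, c] or labels[r, c]:
                continue  # background, or already part of a labeled region
            current += 1
            labels[r, c] = current
            queue = deque([(r, c)])
            # Flood fill: spread the current label to all reachable neighbors.
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = current
                        queue.append((ny, nx))
    return labels, current

mask = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 1],
                 [0, 0, 0, 1],
                 [1, 0, 0, 0]])
labels, count = label_components(mask)
print(count)  # 3
print(labels)
```

Each region receives its own integer label, so later steps can, for example, measure region sizes or keep only the largest region.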
2. Image morphology, also known as mathematical morphology, refers to a family of image processing techniques that operate on the shape features of an image. It is an image analysis discipline built on lattice theory and topology, and it provides the basic theory for morphological image processing. Its basic idea is to probe the input image with a special structuring element in order to measure or extract the corresponding shapes or features for further image analysis and target recognition.
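The book does not go into specific operations, so the following is only my own illustration of the structuring element idea, using the two most basic operations, erosion and dilation, on a binary image. The function names `erode` and `dilate` are hypothetical, and the sketch assumes an odd-sized, symmetric structuring element and treats everything outside the image as background.

```python
import numpy as np

def _shifted_views(binary, structure):
    """Yield the image shifted to every active position of the structuring element."""
    sh, sw = structure.shape
    ph, pw = sh // 2, sw // 2
    # Pad with background so shifts near the border stay in bounds.
    padded = np.pad(binary, ((ph, ph), (pw, pw)), constant_values=False)
    h, w = binary.shape
    for dy in range(sh):
        for dx in range(sw):
            if structure[dy, dx]:
                yield padded[dy:dy + h, dx:dx + w]

def erode(binary, structure):
    """Keep a pixel only if the element fits entirely inside the foreground."""
    binary = np.asarray(binary, dtype=bool)
    structure = np.asarray(structure, dtype=bool)
    out = np.ones_like(binary)
    for view in _shifted_views(binary, structure):
        out &= view
    return out

def dilate(binary, structure):
    """Set a pixel if the element touches the foreground anywhere."""
    binary = np.asarray(binary, dtype=bool)
    structure = np.asarray(structure, dtype=bool)
    out = np.zeros_like(binary)
    for view in _shifted_views(binary, structure):
        out |= view
    return out

square = np.ones((3, 3), dtype=bool)  # 3*3 structuring element

mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True   # a 3*3 object
mask[0, 0] = True       # an isolated noise pixel

# Erosion followed by dilation (an "opening") removes the noise pixel
# but restores the larger object.
opened = dilate(erode(mask, square), square)
print(opened.astype(int))
```

Here the isolated pixel is smaller than the structuring element, so erosion removes it, while the 3*3 object survives erosion as a single pixel and is grown back by dilation.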