图像处理基础（四）边缘提取之 LOG 和 DOG 算子 (2024)

原理

Laplace

对连续二元标量函数 f(x,y) ，其拉普拉斯算子是梯度的散度，即

\begin{aligned} \nabla \cdot {\nabla{f}} = \frac{\partial^2{f}}{\partial{x^2}}+ \frac{\partial^2{f}}{\partial{y^2}}\end{aligned}

对于离散图像而言，

\begin{aligned} \nabla \cdot {\nabla{f}} = f(x + 1,y) + f(x - 1,y) + f(x, y + 1) + f(x, y - 1) - 4f(x,y) \end{aligned}

对应的滤波模板是

如果结果 < 0，说明滤波中心是局部极大值，可能是噪声点，也可能是边缘上的点；结果 = 0，说明滤波中心很可能处于平坦区域；结果 > 0，说明滤波中心是局部极小值，可能是噪声点，也可能是边缘。

很明显，laplace 可以用于定位局部极值，用于寻找边缘

cv::Mat laplace_extract_edges(const cv::Mat& source) { // 获取信息 const int H = source.rows; const int W = source.cols; // padding const auto padded_image = make_pad(source, 1, 1); const int W2 = padded_image.cols; // 准备结果 auto result = source.clone(); // 开始滤波 for(int i = 0;i < H; ++i) { const uchar* row_ptr = padded_image.data + (1 + i) * W2 + 1; uchar* const res_ptr = result.data + i * W; for(int j = 0;j < W; ++j) { // 每个点, 找上下左右四个点 const uchar u = row_ptr[j - W2], d = row_ptr[j + W2], l = row_ptr[j - 1], r = row_ptr[j + 1]; res_ptr[j] = cv::saturate_cast<uchar>(std::abs(u + d + l + r - 4 * row_ptr[j])); } } return result;}

但是 laplace 对于噪声点也能检测出来，所以给上面的图像加一些高斯噪声，边缘检测结果就会差得多

LOG

为了解决上面的问题，一个很简单的想法就是先对图像做高斯平滑，去除一些噪声，再使用 laplace 处理图像。这两步都可以是卷积的方式。连续的卷积，可以先求高斯平滑函数的 laplace，然后再对图像做滤波，证明略。

因此，接下来求高斯平滑函数的 laplace 算子

二元连续的高斯函数 G(x,y,\sigma) = \frac{1}{2 \pi \sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}} ，求 laplace, 梯度的散度，需要求两次偏导

\frac{\partial{G}}{\partial{x}} = \frac{1}{2\pi \sigma^2} \cdot -\frac{1}{2 \sigma^2}2x \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} = -\frac{x}{2\pi \sigma^4}\cdot e^{-\frac{x^2 + y^2}{2\sigma^2}}

\begin{aligned} \frac{\partial^2{G}}{\partial{x}^2} &= -\Big( \frac{1}{2\pi \sigma^4}\cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} -\frac{2x}{2\sigma^2} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \cdot \frac{x}{2\pi\sigma^4} \Big) \\ &= - \frac{1}{2\pi \sigma^4}\cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} + \frac{x^2}{2\pi\sigma^6} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \\ &= \frac{x^2 - \sigma^2}{2\pi \sigma^6} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \end{aligned}

由于 x,y 在函数中是对称的，因此，很容易可以得到

\frac{\partial{G}}{\partial{y}} = -\frac{y}{2\pi \sigma^4}\cdot e^{-\frac{x^2 + y^2}{2\sigma^2}}

\begin{aligned} \frac{\partial^2{G}}{\partial{y}^2} &= \frac{y^2 - \sigma^2}{2\pi \sigma^6} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \end{aligned}

由此可得，高斯函数的 laplace ，LOG

\begin{aligned} \nabla \cdot {\nabla{G(x,y,\sigma)}} = \frac{\partial^2{G}}{\partial{x^2}}+ \frac{\partial^2{G}}{\partial{y^2}}\end{aligned} = \frac{x^2 + y^2 - 2\sigma^2}{2\pi \sigma^6} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}}

5 \times 5 的 LOG 模板如下

在图像处理中，平坦区域的 laplace 为 0，因为没有梯度。上面的 5 \times 5 模板是离散的，和不为 0，因此需要略微修改，简单的做法是统一减去一个微小量，使得和为 0。为了加快处理速度，也可以全转成 int，但会损失一些精度。

上图是二元连续 LOG 函数，可以看到，从滤波中心往外扩散，有一个先增大后减小的过程。对图像滤波时，分三种情况

平坦区域

因为没有梯度，因此不管怎么加权相加减，LOG 模板的响应值为 0

边缘

LOG 模板和图像无关，考虑下图灰度剧烈变化的区域

黑色线表示灰度值，从一个较低的平坦区域到一个较高的平坦区域，两侧的平坦区域处（蓝色）的 LOG 响应值都为 0；灰度跃升时，考虑红色部分，刚开始由于右边加入了一些高区域（绿色）的灰度，所以，LOG 响应值 > 0。

继续向右移动，由于还剩下一部分在低灰度（绿色），此时，LOG 响应值 < 0。

从 > 0 渐变到 < 0，LOG 模板在从低灰度区域到高灰度区域移动的过程中，存在某个时刻的响应值 = 0，但由于是离散的，这个点不一定是图像中的点，粗略检测边缘的时候，可以检查两侧 LOG 响应值符号是否相反，如果一正一负，说明中间存在灰度剧烈变化的点，可以看作边缘。

一正一负，在边缘的两侧，靠近低灰度的区域 LOG 响应值 > 0，靠近高灰度的区域 LOG 区域响应值 < 0。

噪声点

如果 LOG 遇到的是噪声点，考虑上面的情况，如果噪声处在 LOG 模板 < 0的区域（绿色），那么 LOG 响应值一定 < 0，因为突然减去了一个高灰度值，而其它平坦区域不变。

如果考虑噪声处在 LOG 模板 > 0 的区域，那么 LOG 响应值是 > 0 的，同时考虑另一侧的情况

也是一样的，平坦区域为 0，相比之下加入了一个高灰度值，LOG 滤波结果 > 0，因此 LOG 对于噪声的响应值变化如下图

两侧的平坦区域，LOG 响应值 = 0；噪声处于 LOG > 0 区域时，LOG 响应值 > 0；噪声处于 LOG < 0 区域时，LOG 响应值 < 0。

结论是：按照之前边缘检测的方法，找两侧一正一负，LOG 依旧无法避免噪声。。？？？

综上，可以得到 LOG 检测边缘的具体步骤

根据方差和滤波核大小确定 LOG 模板
LOG 模板对图像做滤波
在滤波结果中，找两侧响应值分别一正一负的点
设置一个阈值 t，两侧像素之差 > t 的点被视作真正的边缘

具体实现时，考虑以下 4 种一正一负的情况

代码

std::vector<double> laplace_of_gaussi(const cv::Mat& source, const int radius, const double sigma) { // padding 处理边缘 const auto padded_image = make_pad(source, radius, radius); const int W2 = padded_image.cols; // 准备一个 LOG 模板 const int window_len = (radius << 1) + 1; const int window_size = window_len * window_len; const double sigma_2 = sigma * sigma; const double sigma_6 = sigma_2 * sigma_2 * sigma_2; double LOG[window_size]; int LOG_offset[window_size]; int offset = 0; for(int i = -radius; i <= radius; ++i) { for(int j = -radius; j <= radius; ++j) { const double distance = i * i + j * j; LOG[offset] = (distance - 2 * sigma_2) / sigma_6 * std::exp(-distance / (2 * sigma_2)); LOG_offset[offset] = i * W2 + j; ++offset; } } /* 0.001 0.011 0.044 0.067 0.044 0.011 0.001 0.011 0.100 0.272 0.326 0.272 0.100 0.011 0.044 0.272 0.088 -0.629 0.088 0.272 0.044 0.067 0.326 -0.629 -2.460 -0.629 0.326 0.067 0.043 0.272 0.088 -0.629 0.088 0.272 0.044 0.011 0.100 0.272 0.326 0.272 0.100 0.011 0.001 0.011 0.044 0.067 0.044 0.011 0.001 */ // 平坦区域, LOG 响应为 0 double sum_value = 0.0; for(int i = 0;i < offset; ++i) sum_value += LOG[i]; for(int i = 0;i < offset; ++i) LOG[i] -= sum_value / offset; // 收集原始图像信息 const int H = source.rows; const int W = source.cols; const int length = H * W; // LOG 模板扫过 std::vector<double> LOG_result(length, 0); for(int i = 0;i < H; ++i) { const uchar* const row_ptr = padded_image.data + (radius + i) * W2 + radius; double* const res_ptr = LOG_result.data() + i * W; for(int j = 0;j < W; ++j) { double conv_sum = 0; for(int k = 0;k < offset; ++k) conv_sum += LOG[k] * row_ptr[j + LOG_offset[k]]; res_ptr[j] = conv_sum; } } return LOG_result;}cv::Mat laplace_of_gaussi_edge_detection(const cv::Mat& source, const int radius, const double sigma, const double threshold) { // 获取 LOG 滤波的结果 const auto LOG_result = laplace_of_gaussi(source, radius, sigma); const int H = source.rows; const int W = source.cols; // 准备结果 auto result_image = source.clone(); // 现在 LOG_result 里面是结果, 开始找两侧一正一负的点 for(int i = 1;i < H - 1; ++i) { const double* const row_ptr = LOG_result.data() + i * W; uchar* const res_ptr = result_image.data + i * W; for(int j = 1;j < W - 1; ++j) { // 开始找四个方向 if((row_ptr[j - W] * row_ptr[j + W] < 0 and std::abs(row_ptr[j - W] - row_ptr[j + W]) > threshold) or (row_ptr[j - 1] * row_ptr[j + 1] < 0 and std::abs(row_ptr[j - 1] - row_ptr[j + 1]) > threshold) or (row_ptr[j - W - 1] * row_ptr[j + W + 1] < 0 and std::abs(row_ptr[j - W - 1] - row_ptr[j + W + 1]) > threshold) or (row_ptr[j - W + 1] * row_ptr[j + W - 1] < 0 and std::abs(row_ptr[j - W + 1] - row_ptr[j + W - 1]) > threshold)) { res_ptr[j] = 255; } else res_ptr[j] = 0; } } return result_image;}

对于之前的噪声图像

虽然无法完全去除噪声，但是和之前直接用 laplace 的效果相比，还是减弱了一些

DOG

二元连续高斯函数 G(x,y,\sigma) = \frac{1}{2\pi }\sigma^{-2} e^{-\frac{x^2 + y^2}{2} \sigma^{-2}} 对方差 \sigma 求导，可以得到

\begin{aligned} \frac{\partial{G}}{\partial{\sigma}} &= \frac{1}{2\pi} \cdot (-2) \cdot \sigma^{-3} \cdot e^{-\frac{x^2 + y^2}{2} \sigma^{-2}} -\frac{x^2 + y^2}{2} \cdot (-2) \cdot \sigma^{-3} \cdot e^{-\frac{x^2 + y^2}{2} \sigma^{-2}} \cdot \frac{1}{2\pi}\sigma^{-2} \\ &= \Big( -\frac{1}{\pi\sigma^{3}} + \frac{x^2 + y^2}{2\pi\sigma^5} \Big) \cdot e^{-\frac{x^2 + y^2}{2} \sigma^{-2}} \\ &= \frac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^5} \cdot e^{-\frac{x^2 + y^2}{2\sigma^{2}} } \\ \end{aligned}

很巧地，前面得到的

\begin{aligned} LOG(x,y,\sigma) = \frac{\partial^2{G}}{\partial{x^2}}+ \frac{\partial^2{G}}{\partial{y^2}}\end{aligned} = \frac{x^2 + y^2 - 2\sigma^2}{2\pi \sigma^6} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} ,

二者就差一个 \sigma ，即

\frac{\partial{G}}{\partial{\sigma}} = \sigma \cdot LOG

根据极限，又可以得到

\begin{aligned} \frac{\partial{G}}{\partial{\sigma}} &= \lim_{\Delta\sigma\rightarrow0}{\frac{G(x, y, \sigma + \Delta\sigma) - G(x, y, \sigma)}{\Delta\sigma}}\\ &= \lim_{k\rightarrow1}{\frac{G(x, y, k \sigma ) - G(x, y, \sigma)}{(k - 1)\cdot \sigma}}\\ \end{aligned}

结合前面两式，当 k \approx 1 ，有

\begin{aligned} \frac{G(x, y, k \sigma ) - G(x, y, \sigma)}{(k - 1)\cdot \sigma} &\approx \sigma \cdot LOG \\ G(x, y, k \sigma ) - G(x, y, \sigma) &\approx (k - 1)\cdot \sigma^2\cdot LOG \end{aligned}

两个不同方差的高斯函数相减，记作 DOG 即可近似得到 LOG ，可见，当方差确定时，二者的函数曲线增减趋势是十分相近的，如下图

可以认为，DOG 近似 LOG。

边缘检测时，二者相比

DOG 只需要两个高斯模板分别对图像滤波，然后相减，计算上可以加速，例如分离成两个一维的模板
LOG 更为精确。

#include "faster_gaussi_filter.h"std::vector<double> difference_of_gaussi(const cv::Mat& source, const int radius, const double sigma, const double k) { // 两个高斯卷积 const auto lhs = faster_2_gaussi_filter_channel(source, 2 * radius + 1, k * sigma, k * sigma); const auto rhs = faster_2_gaussi_filter_channel(source, 2 * radius + 1, sigma, sigma); // 准备结果 const int length = source.rows * source.cols; std::vector<double> result_double(length, 0); for(int i = 0;i < length; ++i) result_double[i] = lhs.data[i] - rhs.data[i]; // return result_double;}cv::Mat laplace_of_gaussi_edge_detection(const cv::Mat& source, const int radius, const double sigma, const double threshold) { // 获取 LOG 结果 // const auto LOG_result = laplace_of_gaussi(source, radius, sigma); const auto LOG_result = difference_of_gaussi(source, radius, sigma, 1.1); // ...... 定位过零的点作为边缘}

极值点检测

DOG 比较著名的应用就是在 sift 中定位特征点。每个金字塔都包含不同尺度的高斯模糊之后的图像，下图是某个金字塔（某个分辨率）的 5 个尺度高斯模糊的图像组，相减得到 4 个 DOG 高斯差分图像，

之后，相邻三层 DOG 之间，寻找中间那一层的极值点，如下

比较同一个 DOG 以及上下相邻的 DOG，在邻域 3x3 内共 26 个点，判定是否为极大值或者极小值，如果在多个尺度下都能保证是极值，则可以认为是稳定的特征点，以下是一个粗略的实现

// 单通道关键点检测std::pair< keypoints_type, keypoints_type > difference_of_gaussi_keypoints_detection( const cv::Mat& source, const int radius, const std::vector< double > sigma_list, const double threshold) { const int C = source.channels(); if(C != 1) { std::cout << "只支持单通道图像 !" << std::endl; return std::pair< keypoints_type, keypoints_type >(); } // 四次高斯模糊, 3 张 DOG 图 assert(sigma_list.size() == 4); // 根据 sigma_list 的方差分别做高斯模糊 const int kernel_size = (radius << 1) + 1; const auto gaussi_1 = faster_2_gaussi_filter_channel(source, kernel_size, sigma_list[0], sigma_list[0]); const auto gaussi_2 = faster_2_gaussi_filter_channel(source, kernel_size, sigma_list[1], sigma_list[1]); const auto gaussi_3 = faster_2_gaussi_filter_channel(source, kernel_size, sigma_list[2], sigma_list[2]); const auto gaussi_4 = faster_2_gaussi_filter_channel(source, kernel_size, sigma_list[3], sigma_list[3]); // 分别求 DOG const int H = source.rows; const int W = source.cols; const int length = H * W; std::vector<double> DOG_down(length), DOG_mid(length), DOG_up(length); for(int i = 0;i < length; ++i) DOG_down[i] = gaussi_1.data[i] - gaussi_2.data[i]; for(int i = 0;i < length; ++i) DOG_mid[i] = gaussi_2.data[i] - gaussi_3.data[i]; for(int i = 0;i < length; ++i) DOG_up[i] = gaussi_3.data[i] - gaussi_4.data[i]; // 准备结果, 返回极大值与极小值的点 keypoints_type max_points; keypoints_type min_points; // 三层之间找极值 for(int i = 1;i < H - 1; ++i) { double* const down = DOG_down.data() + i * W; double* const mid = DOG_mid.data() + i * W; double* const up = DOG_up.data() + i * W; for(int j = 1;j < W - 1; ++j) { // 中间这个点的值, 和最近的 26 个点比较大小 const auto center = mid[j]; // 极大值 if(center > mid[j - 1] and center > mid[j + 1] and center > mid[j - 1 - W] and center > mid[j - W] and center > mid[j + 1 - W] and center > mid[j - 1 + W] and center > mid[j + W] and center > mid[j + 1 + W] and center > down[j - 1] and center > down[j] and center > down[j + 1] and center > down[j - 1 - W] and center > down[j - W] and center > down[j + 1 - W] and center > down[j - 1 + W] and center > down[j + W] and center > down[j + 1 + W] and center > up[j - 1] and center > up[j] and center > up[j + 1] and center > up[j - 1 - W] and center > up[j - W] and center > up[j + 1 - W] and center > up[j - 1 + W] and center > up[j + W] and center > up[j + 1 + W]) { //保留响应值比较强的点作为特征点 if(std::abs(center) > threshold) max_points.emplace_back(i, j); } // 极小值 if(center < mid[j - 1] and center < mid[j + 1] and center < mid[j - 1 - W] and center < mid[j - W] and center < mid[j + 1 - W] and center < mid[j - 1 + W] and center < mid[j + W] and center < mid[j + 1 + W] and center < down[j - 1] and center < down[j] and center < down[j + 1] and center < down[j - 1 - W] and center < down[j - W] and center < down[j + 1 - W] and center < down[j - 1 + W] and center < down[j + W] and center < down[j + 1 + W] and center < up[j - 1] and center < up[j] and center < up[j + 1] and center < up[j - 1 - W] and center < up[j - W] and center < up[j + 1 - W] and center < up[j - 1 + W] and center < up[j + W] and center < up[j + 1 + W]) { if(std::abs(center) > threshold) min_points.emplace_back(i, j); } } } return std::make_pair(max_points, min_points);}void demo_5() { // 读取图像 const std::string image_path("../images/detail/a0015-DSC_0081.png"); auto origin_image = cv::imread(image_path); cv::Mat detected_result = origin_image.clone(); // BGR -> Gray 支持灰度图像 cv::cvtColor(origin_image, origin_image, cv::COLOR_BGR2GRAY); // LOG 检测边缘 std::pair<keypoints_type, keypoints_type > key_points; key_points = difference_of_gaussi_keypoints_detection(origin_image, 3, {0.3, 0.4, 0.5, 0.6}, 6); // 极大值 for(const auto point : key_points.first) cv::circle(detected_result, cv::Point(point.second, point.first), 3, CV_RGB(255, 0, 0)); // 极小值 for(const auto point : key_points.second) cv::circle(detected_result, cv::Point(point.second, point.first), 3, CV_RGB(0, 255, 0)); // 展示结果, 保存 auto comparison_results = cv_concat({detected_result}); cv_show(comparison_results); cv_write(comparison_results, "./images/output/LOG_keypoints_detection.png");}

结果

看起来效果还可以。

另外测试几张图像

后续将在 sift 中优化。

问题

LOG 检测边缘时，寻找两侧响应值相反的时候，是离散的点，太过粗略

改进

double 改成 int，损失一些精度，加快计算
LOG 模板也是对称的，而且可以分离成两个一维的 LOG 加快计算

参考资料

https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm
https://homepages.inf.ed.ac.uk/rbf/HIPR2/zeros.htm
https://senitco.github.io/2017/06/20/image-feature-LoG-DoG/
https://www.cnblogs.com/YiXiaoZhou/p/5891645.html
https://blog.csdn.net/gnehcuoz/article/details/52793654
https://blog.csdn.net/u014453898/article/details/106461500
SIFT - Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision, 2004, 60(2): 91-110.
测试图片来源 - MIT-Adobe-5K

图像处理基础 （四）边缘提取之 LOG 和 DOG 算子 (2024)

原理

Laplace

LOG

DOG

极值点检测

问题

改进

参考资料

图像处理基础（四）边缘提取之 LOG 和 DOG 算子 (2024)