Skip to content

Commit

Permalink
update the page
Browse files Browse the repository at this point in the history
  • Loading branch information
fundamentalsystem committed Sep 19, 2023
1 parent bfe1981 commit 810bd1e
Showing 1 changed file with 81 additions and 205 deletions.
286 changes: 81 additions & 205 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,237 +1,113 @@

<h1 style="text-align: center; font-size: 36px; font-family: 'Sama Devanagari'">
CUBIT: A High-resolution Infrastructure Defect Dataset <br> Fully Evaluated with
Autonomous Detection Framework
</h1>
<h2>
<div
style="text-align: center; font-size: 20px; font-family: 'Sama Devanagari'"
>
Submitted to International Conference on Acoustics, Speech, & Signal
Processing 2024 (ICASSP 2024)
</div>
</h2>
<div style="text-align: center; font-size: 17px">
Benyun Zhao<sup>1</sup>, Xunkuai Zhou<sup>2</sup>, Guidong Yang<sup>1</sup>,
Junjie Wen<sup>1</sup>, Jihan Zhang<sup>1</sup>, Xi Chen<sup>1</sup>, and
<a href="http://www.mae.cuhk.edu.hk/~bmchen/">Ben M. Chen</a><sup>1</sup>,
IEEE Fellow
<h1 style="text-align: center; font-size: 36px; font-family: 'Baskerville';"> CUBIT: A High-resolution Infrastructure Defect Dataset <br> Fully Evaluated with Autonomous Detection Framework
<div style="text-align: center; font-size: 20px; font-family: 'Georgia';">
Submitted to International Conference on Acoustics, Speech, & Signal Processing 2024 (ICASSP 2024)
</div>
</h1>

<div style="text-align: center; font-size: 17px">
1.Department of Mechanical and Automation Engineering, The Chinese University
of Hong Kong <br />
2.School of Electronics and Information Engineering,Tongji University
<div style=" text-align: center; font-size: 17px;">
Benyun Zhao<sup>1</sup>, Xunkuai Zhou<sup>2</sup>, Guidong Yang<sup>1</sup>, Junjie Wen<sup>1</sup>, Jihan Zhang<sup>1</sup>, Xi Chen<sup>1</sup>, and <a href="http://www.mae.cuhk.edu.hk/~bmchen/">Ben M. Chen</a><sup>1</sup>, IEEE Fellow
</div>
<div
style="
display: flex;
flex-direction: row;
margin: 10px auto;
justify-content: center;
"
>
<button
style="
background-color: #000000;
color: white;
margin-right: 15px;
padding: 10px 15px;
border: none;
border-radius: 5px;
"
disabled
>
<a
href="https://www.overleaf.com/"
style="color: white; text-decoration: none"
>Paper</a
>
</button>

<button
style="
background-color: #000000;
color: white;
margin-right: 15px;
padding: 10px 15px;
border: none;
border-radius: 5px;
"
disabled
>
<a
href="https://github.com/ZHAOBenyun/CUBIT"
style="color: white; text-decoration: none"
>Dataset</a
>
</button>

<button
style="
background-color: #000000;
color: white;
margin-right: 15px;
padding: 10px 15px;
border: none;
border-radius: 5px;
"
>
<a
href="./ICASSP_2024_Appendix.pdf"
style="color: white; text-decoration: none"
>Appendix</a
>
</button>

<div style="text-align: center; font-size: 17px;" >
1.Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong <br /> 2.School of Electronics and Information Engineering,Tongji University

</div>
<div style="display: flex; flex-direction: row; margin: 10px auto; justify-content: center">

<button style="background-color: #c3bebe; color: white;margin-right: 15px; padding: 10px 15px;border: none; border-radius: 5px;">
<a href="" style="color: white; text-decoration: none;">Paper</a>
</button>

<div
style="
text-align: center;
font-family: 'American Typewriter';
font-weight: 400;
"
>
<h2>Abstract</h2>
<button style="background-color: #c3bebe; color: white;margin-right: 15px; padding: 10px 15px; border: none; border-radius: 5px;">
<a href="" style="color: white; text-decoration: none;">Dataset</a>
</button>

<button style="background-color: #000000; color: white;margin-right: 15px; padding: 10px 15px; border: none; border-radius: 5px;">
<a href="./ICASSP_2024_Appendix.pdf" style="color: white; text-decoration: none;">Appendix</a>
</button>
</div>

<div style="text-align: justify; text-justify: inter-ideograph">
Learning-based visual inspection, integrated with unmanned robotic system,
offers a more effective, efficient, and safer alternative for infrastructure
inspection tasks that are traditionally heavily reliant on human labor.
However, the potential of learning-based inspection methods remains limited
due to the lack of publicly available, high-quality datasets. This paper
presents CUBIT, a high-resolution defect detection dataset comprising more
than <strong><em>5500</em></strong> images with resolutions up to <strong
><em>8000 * 6000</em></strong
>
which covers a broader spectrum of practical situations, backgrounds, and
defect categories than existing publicly available datasets. We conduct
extensive experiments to benchmark the performance of state-of-the-art
real-time detection methods on our proposed dataset, validating the
effectiveness of it. Moreover, based on the benchmark results, we develop a
module named GIPFPP to integrate multi-scale feature, enhancing the AP by 3%
while reducing the number of parameters by 10% on baseline model.
Additionally, a real-site UAV-based inspection has been conducted to verify
the reliability of the dataset.

<div style="text-align: center; font-family: 'American Typewriter'; font-weight: 400; ">
<h2>Abstract</h2>
</div>

<div
style="
text-align: center;
font-family: 'American Typewriter';
font-weight: 400;
"
>
<h3>Sample images in CUBIT</h3>
<div style="text-align: justify; text-justify:inter-ideograph;">
Learning-based visual inspection, integrated with unmanned robotic system, offers a more effective, efficient, and safer alternative for infrastructure inspection tasks that are traditionally heavily reliant on human labor. However, the potential of learning-based inspection methods remains limited due to the lack of publicly available, high-quality datasets. This paper presents CUBIT, a high-resolution defect detection dataset comprising more than <strong><em>5500</em></strong> images with resolutions up to<strong><em>8000 * 6000</em></strong> which covers a broader spectrum of practical situations, backgrounds, and defect categories than existing publicly available datasets. We conduct extensive experiments to benchmark the performance of state-of-the-art real-time detection methods on our proposed dataset, validating the effectiveness of it. Moreover, based on the benchmark results, we develop a module named GIPFPP to integrate multi-scale feature, enhancing the AP by 3% while reducing the number of parameters by 10% on baseline model. Additionally, a real-site UAV-based inspection has been conducted to verify the reliability of the dataset.
</div>
<div style="text-align: justify; text-justify: inter-ideograph">
* The sample images in CUBIT has been shown below.* All the data are collected
by autonomous unmanned systems such as UAV and UGV. Our dataset includes
various scnarios and defect categories compared with the existing open-source
bounding-box level defect detection dataset.

<div style="text-align: center; font-family: 'American Typewriter'; font-weight: 400; ">
<h3>Sample images in CUBIT</h3>
</div>
* The sample images in CUBIT has been shown below.* All the data are collected by autonomous unmanned systems such as UAV and UGV. Our dataset includes various scnarios and defect categories compared with the existing open-source bounding-box level defect detection dataset.
<p align="center">
<img src="./sample.png" style="width: 80%" />
<img src="./sample.png">
</p>

<h3> The Comparison between Existing Bounding-box-level Defect Dataset with CUBIT
</h3>
| Dataset | Num. of Images | Resolution | Data Collection Platform | Category |
Scenario | Material | Experiments |
|---------------|----------------|----------------------|------------------------------------|------------------------|--------------------------|----------------------|-----------------------------------------------|
| RDD-2018 | 9053 | 600x600 | Smartphones | Crack, Corrosion | Pavement |
Asphalt | SSD | | RDD-2019 | 13135 | 600x600 | Smartphones | Crack, Corrosion |
Pavement | Asphalt | SSD | | RDD-2020 | 26336 | 600x600, 720x720 | Smartphones |
Crack, Pothole | Pavement | Asphalt | SSD | | RDD-2022 | 47420 | 512x512,
600x600, 720x720, 3650x2044 | Smartphones, Hand-held cameras, UAV cameras,
Google street view | Crack, Pothole | Pavement | Asphalt | - | | PID | 7237 |
640x640 | Crawled from Internet | Crack | Pavement | Asphalt | YOLOv2, Fast
R-CNN | | Murad | 2620 | up to 838x809 | Smartphones | Crack | Pavement |
Asphalt | Faster R-CNN | | CODEBRIM | 1590 | up to 6000x4000 | Hand-held
cameras, UAV Cameras | Crack, Corrosion | Bridge | Concrete | MetaQNN, ENAS | |
**CUBIT** | **5527** | **4624x3472 and 8000x6000** | **Cameras in Unmanned
Systems** | **Crack, Spallinig, Moisture** | **Building (65%), Pavement (29%),
Bridge (6%)** | **Concrete, Asphalt, Stone** | **Faster R-CNN, PP-YOLO,
PP-YOLOv2, YOLOX, YOLOv5, YOLOv7, YOLOv6, YOLOv6+GIPFPP(ours), Real-site
experiment** |


<div
style="
text-align: center;
font-family: 'American Typewriter';
font-weight: 400;
"
>
<h3>Defect Detection Framework based on CUBIT</h3>
<div style="text-align: center; font-family: 'American Typewriter'; font-weight: 400; ">
<h3>Defect Detection Framework based on CUBIT</h3>
</div>

* The visualization of defect detection framework based on CUBIT dataset is
illustrated below *, which encompasses the entire process: data collection by
autonomous unmanned system; the baseline network integrated with our GIPFPP
module; the output of defect detection results.
* The visualization of defect detection framework based on CUBIT dataset is illustrated below *, which encompasses the entire process: data collection by autonomous unmanned system; the baseline network integrated with our GIPFPP module; the output of defect detection results.
<p align="center">
<img src="./frame.png" style="width: 80%" />
<img src="./frame.png">
</p>

## The Comparison between Existing Bounding-box-level Defect Dataset with CUBIT

| Dataset | Num. of Images | Resolution | Data Collection Platform | Category | Scenario | Material | Experiments |
|---------------|----------------|----------------------|------------------------------------|------------------------|--------------------------|----------------------|-----------------------------------------------|
| RDD-2018 | 9053 | 600x600 | Smartphones | Crack, Corrosion | Pavement | Asphalt | SSD |
| RDD-2019 | 13135 | 600x600 | Smartphones | Crack, Corrosion | Pavement | Asphalt | SSD |
| RDD-2020 | 26336 | 600x600, 720x720 | Smartphones | Crack, Pothole | Pavement | Asphalt | SSD |
| RDD-2022 | 47420 | 512x512, 600x600, 720x720, 3650x2044 | Smartphones, Hand-held cameras, UAV cameras, Google street view | Crack, Pothole | Pavement | Asphalt | - |
| PID | 7237 | 640x640 | Crawled from Internet | Crack | Pavement | Asphalt | YOLOv2, Fast R-CNN |
| Murad | 2620 | up to 838x809 | Smartphones | Crack | Pavement | Asphalt | Faster R-CNN |
| CODEBRIM | 1590 | up to 6000x4000 | Hand-held cameras, UAV Cameras | Crack, Corrosion | Bridge | Concrete | MetaQNN, ENAS |
| **CUBIT** | **5527** | **4624x3472 and 8000x6000** | **Cameras in Unmanned Systems** | **Crack, Spallinig, Moisture** | **Building (65%), Pavement (29%), Bridge (6%)** | **Concrete, Asphalt, Stone** | **Faster R-CNN, PP-YOLO, PP-YOLOv2, YOLOX, YOLOv5, YOLOv7, YOLOv6, YOLOv6+GIPFPP(ours), Real-site experiment** |



<div
style="
text-align: center;
font-family: 'American Typewriter';
font-weight: 400;
"
>
<h3>
Prediction results on the test set of the proposed CUBIT-RGB-v1 defect
dataset are shown below
</h3>

<div style="text-align: center; font-family: 'American Typewriter'; font-weight: 400; ">
<h3>Prediction results on the test set of the proposed CUBIT-RGB-v1 defect dataset are shown below
</h3>
</div>

<h2> Experimental Results The evaluation results of SOTA real-time detection
</h2>
methods and YOLOv6-n with our GIPFPP module are benchmarked in the table below.
After switching from the original module to GIPFPP module, the AP of YOLOv6-n is
improved by 3%, while its number of parameters is reduced by 10%. The
enhancements made to the model will facilitate the real-time defect detection
using unmanned systems. ## The Evaluation Results of SOTA models on CUBIT |
Model | #Params.(M) | FLOPs(G) | Size | mAP$_{50}^{test}$ / mAP$_{50:95}^{test}$
| Latency(ms) |
|-----------------------------|-------------|----------|------|-----------------------------------------|--------------|
| Faster R-CNN(Res50) | 42.62 | 477.24 | 1024 | 71.5% / 43.3% | 76.9 | | PP-YOLO
| 48.99 | 136.43 | 1024 | 76.4% / 45.1% | 14.5 | | PP-YOLOv2 | 56.91 | 146.50 |
1024 | 77.3% / 47.1% | 13.8 | | YOLOv5-n | 1.76 | 4.10 | 1024 | 73.4% / 39.9% |
1.8 | | YOLOv5-s | 7.18 | 15.80 | 1024 | 78.5% / 47.2% | 3.3 | | YOLOv7-t | 6.01
| 13.01 | 1024 | 71.1% / 39.7% | 1.9 | | YOLOX-n | 2.24 | 17.75 | 1024 | 73.0% /
39.5% | 4.4 | | YOLOX-t | 5.03 | 39.00 | 1024 | 75.3% / 49.2% | 5.8 | | YOLOX-s
| 8.94 | 68.51 | 1024 | 77.9% / 49.4% | 7.6 | | YOLOv6-n(baseline) | 4.63 |
29.03 | 1024 | 76.3% / 47.9% | 2.2 | | YOLOv6-s | 18.50 | 115.64 | 1024 | 79.0%
/ 48.2% | 5.3 | | **YOLOv6-n+GIFPFF(ours)** | **4.14 (-0.49)** | **28.02
(-1.01)** | 1024 | **77.5% (+1.2) / 50.3% (+3.1)** | **2.2** |


We enlarge the
prediction results in the bottom right corner of framework images above. CUBIT
dataset covers three infrastructure types: **Building facade, Pavement**, and
**Bridge**, and aims for three types of defect: **Crack, Spalling, and
Moisture**. Rectangles indicate the output prediction box
<font color="red">Red</font> for Crack, <font color="pink">Pink</font> for
Spalling, and <font color="orange">Orange</font> for Moisture with inferred
defect type and confidence score from YOLOv6-l trained on the training set of
our proposed dataset.
## Experimental Results
The evaluation results of SOTA real-time detection methods and YOLOv6-n with our GIPFPP module are benchmarked in the table below. After switching from the original module to GIPFPP module, the AP of YOLOv6-n is improved by 3%, while its number of parameters is reduced by 10%. The enhancements made to the model will facilitate the real-time defect detection using unmanned systems.

## The Evaluation Results of SOTA models on CUBIT

| Model | #Params.(M) | FLOPs(G) | Size | mAP$_{50}^{test}$ / mAP$_{50:95}^{test}$ | Latency(ms) |
|-----------------------------|-------------|----------|------|-----------------------------------------|--------------|
| Faster R-CNN(Res50) | 42.62 | 477.24 | 1024 | 71.5% / 43.3% | 76.9 |
| PP-YOLO | 48.99 | 136.43 | 1024 | 76.4% / 45.1% | 14.5 |
| PP-YOLOv2 | 56.91 | 146.50 | 1024 | 77.3% / 47.1% | 13.8 |
| YOLOv5-n | 1.76 | 4.10 | 1024 | 73.4% / 39.9% | 1.8 |
| YOLOv5-s | 7.18 | 15.80 | 1024 | 78.5% / 47.2% | 3.3 |
| YOLOv7-t | 6.01 | 13.01 | 1024 | 71.1% / 39.7% | 1.9 |
| YOLOX-n | 2.24 | 17.75 | 1024 | 73.0% / 39.5% | 4.4 |
| YOLOX-t | 5.03 | 39.00 | 1024 | 75.3% / 49.2% | 5.8 |
| YOLOX-s | 8.94 | 68.51 | 1024 | 77.9% / 49.4% | 7.6 |
| YOLOv6-n(baseline) | 4.63 | 29.03 | 1024 | 76.3% / 47.9% | 2.2 |
| YOLOv6-s | 18.50 | 115.64 | 1024 | 79.0% / 48.2% | 5.3 |
| **YOLOv6-n+GIFPFF(ours)** | **4.14 (-0.49)** | **28.02 (-1.01)** | 1024 | **77.5% (+1.2) / 50.3% (+3.1)** | **2.2** |



We enlarge the prediction results in the bottom right corner of framework images above. CUBIT dataset covers three infrastructure types: **Building facade, Pavement**, and **Bridge**, and aims for three types of defect: **Crack, Spalling, and Moisture**. Rectangles indicate the output prediction box <font color="red">Red</font> for Crack, <font color="pink">Pink</font> for Spalling, and <font color="orange">Orange</font> for Moisture with inferred defect type and confidence score from YOLOv6-l trained on the training set of our proposed dataset.
<p align="center">
<img src="./index_show.png" style="width: 80%" />
<img src="./index_show.png">
</p>

<strong> Qualitative visualization of UAV-based real-world experiment is shown below</strong>
On the left, our multi-UAVs inspection schematics is illustrated. On the right, the
detection results of four direction façades of the building are displayed.
*Qualitative visualization of UAV-based real-world experiment is shown below* On the left, our multi-UAVs inspection schematics is illustrated. On the right, the detection results of four direction façades of the building are displayed.
<p align="center">
<img src="./goodman_zigzag.png" style="width: 80%" />
<img src="./goodman_zigzag.png">
</p>
<div style="text-align: center; font-family: 'American Typewriter'; font-weight: 400; ">
<h2>Acknowledgement</h2>
</div>
This work was supported by the InnoHK of the Government of the Hong Kong Special Administrative Region via the Hong Kong Centre for Logistics Robotics.


0 comments on commit 810bd1e

Please sign in to comment.