new examples added
mjirik committed Dec 30, 2019
1 parent 0b62ba1 commit 05154b8
Showing 2 changed files with 5,106 additions and 18 deletions.
83 changes: 65 additions & 18 deletions src/paper.tex

% M. Jirik added - code highlighting
\usepackage[cache=false]{minted}
% \usepackage{minted}
%SetFonts


\section{Background}\label{sec:background}

Some basic concepts of solid modeling, in particular the foundational idea of a representation scheme, as well as a few basic concepts of algebraic topology, are briefly introduced in this section, including the computation of the matrix of a boundary operator between chain spaces.

\subsection{Representation Schemes}\label{sec:schemes}

A \emph{representation scheme} for solid modeling is a mapping between a space of mathematical models and a space of symbolic representations, such as those generated by a formal grammar.
Solid pointsets (i.e., `$r$-sets') are defined~\cite{Requicha:1980:RRS:356827.356833} as compact (bounded and closed) regular and semianalytic subsets of the $d$-space. A large number of representation schemes have been defined over the past forty years, including the two main classes of (a) \emph{boundary representations} (`$B$-reps'), where the solid model is represented through a representation of its boundary elements, i.e.~faces, edges and vertices, and (b) \emph{decompositive/enumerative representations}, which decompose either the object or the embedding space, respectively, into a well-defined \emph{cellular complex}. In particular, a boundary representation provides a cellular decomposition of the object's boundary into \emph{cells} of dimension zero (vertices), one (edges), and two (faces). Medical imaging can be classified as an \emph{enumerative representation} of cellular decompositions of the organs and tissues of interest, in particular as subsets of \emph{3D volume elements} (voxels) from the 3D image.


\subsection{Linear Algebraic Representation}\label{sec:lar}

The \emph{Linear Algebraic Representation} (\textsc{lar}), introduced in~\cite{Dicarlo:2014:TNL:2543138.2543294}, aims to represent the \emph{chain complex}~\cite{TSAS} generated by a piecewise-linear \emph{geometric complex} embedded either in 2D or in 3D. In a few words, it provides a minimal characterization of the geometry and topology of a cellular complex, i.e.~the embedding mapping $\mu : C_0 \to \E^d$ of 0-cells (vertices), as well as a description of $(d-1)$-cells as subsets of vertices, and is able to return the whole chain complex
%\[
\label{fig:boundary_matrix_4x4x4}
\end{figure}

\subsection{Multiindices from Cartesian indices}\label{sec:inds-from-cart}

In order to utilize the topological algebra briefly recalled in this paper, we need to explicitly sort the cells of the various dimensions into linearly ordered sequences, preferably according to the order in which their information is accommodated in computer storage.
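As an illustration, the Cartesian-to-linear map and its inverse can be sketched in a few lines (a Python sketch, not the paper's Julia code, assuming column-major storage and hypothetical helper names):

```python
# Hypothetical sketch: column-major linearization of Cartesian indices
# (i, j, k) within an n x n x n block, and its inverse.
def cart_to_linear(i, j, k, n):
    return i + n * j + n * n * k

def linear_to_cart(h, n):
    k, r = divmod(h, n * n)   # most significant coordinate first
    j, i = divmod(r, n)
    return i, j, k
```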

\subsection{Taubin Smoothing}\label{sec:taubin}

Every boundary chain extracted from an image block $\B(i,j,k,n)$ is a \emph{2-cycle}, i.e., a closed 2-chain (a 2-chain with empty boundary). Such 2-cycles are joined together by removing the duplicate 2-cells (at the boundaries of adjacent bricks), after having suitably shifted their indices to a unique linear representation of the whole image. The resulting raster surface is made of mutually orthogonal raster facets, which must be smoothed in order to obtain a fair surface. A linear time and space algorithm for this purpose is Laplacian smoothing, which iteratively moves each vertex (0-cell) to the centroid of its neighbors. A well-known weakness of this simple algorithm is the asymptotic convergence of the whole mesh to a single point, resulting in unwanted size reduction even after a few iterations. Conversely, the Taubin smoothing algorithm~\cite{Taubin:1995:SPA:218380.218473,egst.20001029} alternates two Laplacian smoothing steps with \emph{shrink} and \emph{inflate} effects, respectively, keeping the size and volume of the smoothed mesh nearly invariant. The best results are obtained on meshes with small variations of edge length and face angles, such as surfaces extracted from 3D raster images, as in our case.
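The alternating shrink/inflate iteration can be sketched as follows (a minimal Python illustration rather than the paper's Julia implementation; parameter values are assumed, not prescribed):

```python
import numpy as np

def taubin_smooth(V, neighbors, lam=0.4, mu=-0.3, iterations=10):
    # V: (n, d) vertex positions; neighbors[i]: indices adjacent to vertex i.
    # Alternate a shrinking Laplacian step (lam > 0) with an inflating one
    # (mu < 0), so that size and volume stay nearly invariant.
    V = np.asarray(V, dtype=float).copy()
    for _ in range(iterations):
        for factor in (lam, mu):
            L = np.array([V[nb].mean(axis=0) - V[i]
                          for i, nb in enumerate(neighbors)])
            V += factor * L
    return V
```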


\section{Block-parametric design}\label{sec:filter}

\subsection{Block decomposition}\label{sec:block-decomposition}

We assume that medical devices produce 3D images with lateral dimensions that are integer multiples of some powers of two, such as 128, 256, or 512.
Any cuboidal portion of the image is completely determined by the Cartesian indices of its voxels of lowest and highest index, and is extracted by multidimensional array \emph{slicing} as $Image([\ell_x : h_x, \ell_y : h_y, \ell_z : h_z])$.
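The decomposition into cuboidal blocks by slicing can be sketched as below (a Python/NumPy illustration with a hypothetical helper name; the paper's pipeline uses Julia multiarray slicing):

```python
import numpy as np

def image_blocks(image, n):
    # Split a 3D image into cuboidal n x n x n blocks by multidimensional
    # slicing; assumes each lateral size is an integer multiple of n.
    u, v, w = image.shape
    for lx in range(0, u, n):
        for ly in range(0, v, n):
            for lz in range(0, w, n):
                yield (lx, ly, lz), image[lx:lx + n, ly:ly + n, lz:lz + n]
```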

\subsection{Block operator }\label{sec:block}

\subsubsection*{Chain coordinates }\label{sec:chain-coords}
We are going to treat each image block independently of the others. Hence we map each image block $\B(i,j,k,n)$ to the linear \emph{chain} space $C_2$ of dimension $n\times n\times n$, using coordinate vectors $c\in \mathbf{2}^{n^d} := \{0,1\}^{n^d}$, where each basis element $c \in C_2$ is mapped via the Cartesian-to-linear map to the binary vector
\[
Image(h,k) \mapsto c_{h,k} := [0 \cdots 0\ 1\ 0 \cdots 0] \in \B^{n\times n}
\]

Therefore, each pixel (or voxel) in a block image will be seen as a basis binary vector in $C_2$, and each subset of image elements as the corresponding binary vector in $C_2$, with as many ones as the cardinality of the subset.
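As an illustration (Python sketch, with column-major flattening assumed), the coordinate vector of a subset of image elements is simply the 0/1 flattening of the binary block:

```python
import numpy as np

def chain_coordinates(block):
    # Binary image block -> coordinate vector of the corresponding chain:
    # one 1 per selected cell, in column-major (Fortran) linear order.
    return (np.asarray(block) != 0).flatten(order="F").astype(np.uint8)
```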

\subsubsection*{Boundary operator }\label{sec:boundary-operator}
For a fixed block size $n$, the boundary operator $\partial_d : C_d\to C_{d-1}$, with $d\in\{2,3\}$, will be constructed once and for all using the algorithm given in~\cite{}, and inlined in the generated boundary extraction code.

It is easy to see that the operator's matrix $[\partial_d]$ is \emph{very sparse}, since it contains $2d$ non-zero elements (ones) in each of its $n^d$ columns, i.e.~4 ones and 6 ones in the 2D and 3D case, respectively. In fact, the matrix of a linear operator between linear spaces contains, by columns, the basis elements of the domain space represented in the target space. In our case, the former is an image element (2-cube or 3-cube), represented as the chain of its boundary, i.e.~either a 1-cycle of 4 edges or a 2-cycle of 6 faces, respectively.

The number of rows of $[\partial_d]$ equals the dimension of the linear space $C_{d-1}$, i.e.~the number of $(d-1)$-cells (elementary $(d-1)$-chains) in the cellular partition of the image. To compute their number, we proceed in two steps. (a) First we map one-to-one the $n^d$ $d$-cells to $d$ adjacent $(d-1)$-cells each, so getting $d\,n^d$ distinct basis elements of $C_{d-1}$. (b) Then we complete this basis by adjoining $n^{d-1}$ boundary elements for each of the $d$ dimensions of the image, providing a further $d\,n^{d-1}$ basis elements for $C_{d-1}$. The dimension of $C_{d-1}$, and therefore the number of rows of the $[\partial_d]$ matrix, is $d\,(n^{d-1}+n^{d}) = d\,n^{d-1}(1+n)$. The number of columns equals the number of basis elements of $C_d$, i.e.~the number $n^d$ of block elements.
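The counting of $(d-1)$-cells can be checked by direct enumeration (a Python sketch; each facet is identified by its orthogonal axis and its grid-plane position):

```python
from itertools import product

def num_facets(n, d):
    # Direct enumeration of the axis-orthogonal (d-1)-cells of an n^d grid:
    # along each axis there are (n+1) grid planes, each carrying n^(d-1)
    # facets orthogonal to that axis.
    facets = set()
    for cell in product(range(n), repeat=d):
        for axis in range(d):
            for side in (0, 1):
                key = cell[:axis] + (cell[axis] + side,) + cell[axis + 1:]
                facets.add((axis, key))
    return len(facets)
```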

\subsubsection*{Sparsity and size of boundary matrix }\label{sec:bm-size}

As we have seen, there are $2d$ non-zero elements in each column of $[\partial_d]$, so that their total number is $2d\,n^d$. The number of matrix elements is $d\,n^{d-1}(1+n) \times n^d$, giving a ratio of
\[
\frac{2d\,n^d}{d\,n^{d-1}(1+n)\times n^d} = \frac{2}{n^{d-1}(1+n)}.
\]
In conclusion, for block size $n=64$, the matrix $[\partial_d]$ requires $9\times 64^2=36{,}864$ memory elements for 2D images, and $13\times 64^3=3{,}407{,}872$ memory elements for 3D images. Counting the bytes for the standard implementation of a sparse binary matrix (1 byte for values and 8 bytes for indices), we get $(18d+8)\,n^d$ bytes, giving $176$\,KB for 2D and $15.872$\,MB for 3D.
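These byte counts can be reproduced with a few lines (a Python sketch of the counting, under the stated 1-byte-value/8-byte-index assumption):

```python
def boundary_matrix_bytes(n, d):
    # 2d non-zeros per column at 9 bytes each (1 value + 8 index), plus
    # 8 bytes of column-start index per column, for n^d columns.
    return (18 * d + 8) * n**d

kb_2d = boundary_matrix_bytes(64, 2) / 1024   # 176 KB for 2D
kb_3d = boundary_matrix_bytes(64, 3) / 1024   # 15872 KB for 3D
```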


%TODO fix the name
\begin{figure}[htbp] % figure placement: here, top, bottom, or page
\centering
\includegraphics[width=0.4\linewidth]{figs/grid.pdf}
\end{figure}



\subsection{Block boundary mapping}\label{sec:block-mapping}

Here we refer directly to the 3D case.
Let us call \emph{segment} the bulk content $S$ of interest within the input 3D image of size $(u,v,w)$. Our aim is to compute the segment boundary $\partial_3 S$.
\]


\subsection{Block-level parallelism}\label{sec:block-parallelism}

In the computational pipeline introduced in this paper, several steps can be efficiently performed in parallel at image-block level, owing to the embarrassingly data-parallel nature of the problem. In particular, little effort is needed to separate the problem into a number of parallel tasks $S_{i,j,k}$, using multiarray slicing. The granularity of parallelism, depending on the block size $n$, is further supported by the computation of a single boundary matrix $[\partial_d(n)]$ depending only on $n$, so that the initial communication cost of broadcasting the matrix to nodes can be carefully controlled and finely tuned depending on the system architecture. The whole approach is appropriate for SIMD hybrid architectures of CPUs and GPUs, since only the initial block setup of the boundary matrix and image slices, as well as the final collection of computed surface portions, require inter-process communication.

\section{Julia implementation}\label{sec:julia}
\subsection{Parallel workflow}\label{sec:implementation}


\subsubsection*{Workflow setup}\label{sec:workflow}
The functions in this preliminary step include:
\begin{enumerate}

\item input of 3D medical image $\mathcal{I}$ with \emph{shape} $(\ell_1, \ell_2, \ell_3)$, such that: $\mathcal{I} = [\ell_1]\times[\ell_2]\times[\ell_3]$, where $[\ell_k] = [1,2,\ldots,\ell_k]$;
With $size=64$, the number of non-zeros within the sparse matrix $[\partial(64^3)]$ is $\mathtt{nnz} = 4{,}792{,}266$, for a memory size of $9\times \mathtt{nnz}+8\times 262{,}144 \simeq 45$\,MB. The memory size of the sparse matrix is computed by considering $8+1$ bytes per non-zero element (there are exactly 6 per row), plus 8 bytes for each column-start index.
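The quoted figure follows directly from this counting (a Python sketch using the paper's $\mathtt{nnz}$ value):

```python
def sparse_bytes(nnz, num_starts):
    # 8 + 1 bytes per stored non-zero, plus 8 bytes per start index.
    return 9 * nnz + 8 * num_starts

total = sparse_bytes(4_792_266, 262_144)   # approximately 45 MB
```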


\subsubsection*{Job enqueuing}\label{sec:job-enq}
Communication and data synchronization may be managed through \emph{Channels}, which are FIFO conduits providing producer/consumer communication. Overall execution time can be improved if other tasks can run while a task is being executed, or while waiting for an external service/function to complete. The single work items of this stage follow:
\begin{enumerate}

\item enqueuing the job (as a sequence of integer positions of the non-zero image elements, aligned in a memory buffer of the proper \texttt{Channel} type).
\end{enumerate}
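Julia \texttt{Channel}s have a rough Python analogue in \texttt{queue.Queue}; the producer/consumer pattern of this stage can be sketched as follows (hypothetical job payloads, illustration only):

```python
import queue
import threading

def producer(jobs, blocks):
    for blk in blocks:
        jobs.put(blk)        # one job per non-empty image block
    jobs.put(None)           # sentinel: no more work

def consumer(jobs, results):
    # Consume jobs until the sentinel arrives; the payload here is a
    # placeholder for the per-block boundary computation.
    while (blk := jobs.get()) is not None:
        results.put(("boundary", blk))

jobs, results = queue.Queue(), queue.Queue()
worker = threading.Thread(target=consumer, args=(jobs, results))
worker.start()
producer(jobs, [(0, 0, 0), (0, 0, 1), (0, 1, 0)])
worker.join()
```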

\subsubsection*{3-Chain encoding}\label{chain-coding}
The interesting part of the \emph{Image} $\mathcal{I}$ is called the \emph{Segment} $\mathcal{S}$. The goal of the whole \emph{workflow} is to extract a \emph{boundary model} of $\mathcal{S}$ from $\mathcal{I}$. The portion of $\mathcal{S}$ inside $\mathcal{B}$ will be denoted as $\mathcal{S}(B)$.
Each block $\mathcal{B}$ of the 3D image must be converted into the \emph{coordinate representation} of a vector $\nu\in C_3$ in the linear space of 3-chains.


\end{enumerate}

\subsubsection*{SpMM Multiplication}\label{SpMM-multiplication}
According to the current literature~\cite{}, it is more convenient to execute SpMV (sparse matrix-vector) multiplications than SpMSpV (sparse matrix-sparse vector) multiplications. Since we have 256 such jobs (one multiplication per block) to perform in the standard setting of the algorithm\footnote{
Block size $64^3$; image size $512^2\times 256$.
}, or more in case of either smaller blocks or images larger than the standard one, this stage must evidently be parallelized and carefully tuned, possibly using the GPU, if available.

\end{enumerate}
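The per-block SpMV kernel can be sketched in a few lines (a Python illustration of a CSC-format multiplication; the actual implementation relies on Julia's sparse arrays):

```python
def spmv_csc(nrows, values, row_index, col_ptr, x):
    # y = A @ x for a CSC sparse matrix A; x is a dense chain-coordinate
    # vector. Columns with zero coordinate are skipped entirely.
    y = [0] * nrows
    for j, xj in enumerate(x):
        if xj:
            for p in range(col_ptr[j], col_ptr[j + 1]):
                y[row_index[p]] += values[p] * xj
    return y
```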

\subsubsection*{2-Chain decoding}\label{sec:two-chain-decoding}
Each multiplication of $[\partial_B] : C_3 \to C_2$ by a 3-chain $\nu\in C_3$ produces a 2-chain $\sigma\in C_2$, i.e.~the \emph{coordinate representation} of the \emph{boundary vector} $\sigma\in C_2$. The inverse of the coding algorithm is executed in this stage. This process can also be partially overlapped in time with the previous ones, depending on the size of the memory buffers used to feed the CPU cores or the GPUs and to collect their results. Some elementary steps follow:

\begin{enumerate}

A vectorized Julia pipeline dataflow seems the most appropriate implementation model for the job of each worker.

\subsubsection*{Assembling and artifact filtering}\label{sec:artifact-filtering}
The results of the previous stages can be described as a \emph{collection of sets} of \emph{geometric quadrilaterals (quads)}, each one encoded as an array of quadruples of integer indices pointing to the linear array of grid vertices associated with the image block $\mathcal{B}$. In other words, \emph{all quads} of \textbf{each job} are now given in the \textbf{same} \emph{local coordinates}. Besides putting each partial surface $\mathcal{S}(B) = (\texttt{V}_B, \texttt{FV}_\sigma)$ into the global coordinate system of the image, the present stage must eliminate the redundant boundary features possibly generated at the edges of the partial surface $\mathcal{S}(B)$ within each block $\mathcal{B}$ such that $B \cap \mathcal{I} \not= \emptyset$:

\begin{enumerate}

\end{enumerate}
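The removal of doubled boundary facets can be sketched as below (a Python illustration; quads are assumed to be vertex-index tuples already expressed in global coordinates):

```python
from collections import Counter

def filter_interior_quads(quads):
    # A facet shared by two adjacent blocks appears twice, possibly with a
    # different vertex order; keep only facets appearing exactly once.
    canon = lambda q: tuple(sorted(q))
    counts = Counter(canon(q) for q in quads)
    return [q for q in quads if counts[canon(q)] == 1]
```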

\subsubsection*{Smoothing}\label{sec:smoothing}
The final smoothing of the generated surfaces cannot be performed block-wise, since this would introduce smoothing artifacts at the block boundaries. However, Taubin smoothing~\cite{} can be performed in parallel, since for each vertex of the final surface (except possibly those on the boundaries of the image $\mathcal{I}$) it essentially consists in computing a new position as a proper average of its neighboring vertices, i.e.~by applying a discrete Laplacian operator. An appropriate set of workers may thus be assigned the task of iteratively generating new positions for the vertices they take care of. In particular, we have:
\begin{enumerate}

%\input{examples}
\section{Examples}\label{sec:examples}

\begin{verbatim}
using Distributed
addprocs(3)                      # add three worker processes
using LarSurf

LarSurf.lsp_setup([20, 20, 20])  # set the block size
segmentation = LarSurf.generate_truncated_sphere(10)
V, FV = LarSurf.lsp_get_surface(segmentation)  # extract boundary surface
FVtri = LarSurf.triangulate_quads(FV)          # quads to triangles
Vs = LarSurf.Smoothing.smoothing_FV_taubin(V, FV, 0.4, -0.3, 50)
objlines = LarSurf.Lar.lar2obj(Vs, FVtri, "truncated_sphere_tri_taubin.obj")
\end{verbatim}

We include further examples of extracted surfaces, computed with LAR-SURF from micro-CT data of corrosion casts of pig liver~\cite{eberlova2017use}.
% The resolution


\begin{minted}{julia}
# Install the package first (assuming LarSurf is available to Julia's
# package manager):
using Pkg
Pkg.add("LarSurf")
\end{minted}
% \begin{minted}[breaklines,escapeinside=||,mathescape=true, linenos, numbersep=3pt, gobble=2, frame=lines, fontsize=\small, framesep=2mm]{julia}
% Your awesome julia code here\end{minted}

\label{fig:example_porta}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=0.3\textwidth]{figs/liver_01_red_3.png}
\vspace{0.05\textwidth}
\includegraphics[width=0.3\textwidth]{figs/portalvein_01_yellow_3.png}
\vspace{0.05\textwidth}
% \includegraphics[height=0.3\textwidth]{figs/nrn10_100_green.png}
\includegraphics[height=0.3\textwidth]{figs/nrn10_100_low_res.png}
%\includegraphics[scale=1]{input/ircad_comparison.pdf}
\caption{Triangulated isosurfaces calculated with LAR-SURF}
\label{fig:example_ircad}
\end{figure}



%\input{conclusion}

\subsection{3D-Ircadb Dataset}

For performance analysis, the public dataset from the Research Institute against Digestive Cancer (IRCAD)~\cite{ircadb} was used. Tables~\ref{tab:ircad1} and~\ref{tab:ircad2} describe the dataset.

\begin{table}
\input{input/dataset_ircad}
\label{tab:ircad1}
\end{table}

\clearpage{}
\bibliographystyle{alpha}
% \bibliography{library,TSAS-2019,references}
\bibliography{TSAS-2019,references}

\end{document}