I love mathematics – because it is the only “language” in which humans can communicate with each other without the barrier of their native tongues; because it is as beautiful as art, and you can truly enjoy mastering a mathematical topic (just like playing a musical instrument or painting a picture); because it is both pure (“simple enough but not simpler” – to a mathematician) and applied (“anything you can do, I can do better” – to a math-trained mind). I can only look up to the greatest mathematical minds in human history and thank them for revealing so many elegant concepts and theorems to their descendants. I do believe in what Erdős called “proofs from The Book” – mathematical reasoning represents the highest level of intellectual gymnastics, and any humble mind cannot help wondering whether the creator knows a simpler proof. I am grateful to my own creator that I am talented enough to appreciate the beauty of math, even though at a level lower than that of professional players, because …

I also hate mathematics. It is a language abused by some to disguise the truth; it often serves as “the emperor’s new clothes” (indeed it works even better, because a mathematically decorated paper can more easily fool less mathematically trained minds); it deals with mentally reproducible objects only – if its connection with the physical world is cut off, one inevitably gets trapped in delusion. Abusing mathematical language is like pouring paint onto a canvas and calling it creative art; the legendary physicist Richard Feynman warned us long ago that the law of gravity can be derived from at least three different mathematical formulations, all of them equivalent. So no mathematical topic or tool is more fundamental or important than another (e.g., algebra vs. geometry). I still detest a probability course I took in my graduate study, in which the instructor claimed that I was “completely confused by the advanced notation of probability theory”. To me, if the depth of a topic has to be conveyed through the mastery of some specific symbol system, it is no better than dried meat or fish whose nutrition is already gone.

So how do I love and hate mathematics at the same time? For every piece of mathematics (e.g., a definition, theorem, or algorithm), I first find out its history and understand its context. I firmly believe good math is a long-lived meme with a character distinct from bad math. A mathematician needs no apology as long as he or she recognizes that no mathematics is intrinsically “better” than another (e.g., that advanced probability theory somehow outranks elementary probability theory). Meanwhile, for every scientific topic (e.g., in physics, chemistry, biology, sociology, etc.), I look for the simplest mathematical model or language for communication. I keep in mind that “all models are wrong, but some are useful” (George Box). I will not be fooled by better experimentally reproducible results (often on a highly limited test data set) alone, but will try to understand their implications (i.e., underlying models or hypotheses) in a mentally reproducible manner. I might never be a good mathematician or a good scientist, but I will be happy to be the king of the middle ground – maybe that is what a good engineer is about.


This blog is a follow-up to my previous blog: https://masterxinli.wordpress.com/2009/09/01/the-relationship-of-mathematics-to-image-processing/. In that blog I sounded some cautionary notes about the abuse of mathematics in image processing. The tone may have overshot a little, so I will “correct” myself in this blog. As I have stated in class, mathematics “turns art into science” – so even though I maintain my position that math is only a tool, it is always a good idea to learn to use more and different tools. Here is a list of tools relevant to image processing:

1. Euler-Lagrange equation in variational calculus.

2. EM algorithm in statistics.

3. Eigenvalues and Laplacian of a graph in spectral graph theory.

4. Convex optimization in optimization theory.

5. Finite difference method in numerical analysis.

6. Fixed-point theorem in dynamical systems.

7. The law of large numbers (LLN) and heavy-tailed distributions in probability theory.

8. The first fundamental form of surface in differential geometry.

9. Manifold in topology.

10. Singular value decomposition in matrix theory.

In the first week’s lectures, I emphasized the difference between mentally reproducible and experimentally reproducible research (more or less: theory vs. experiment). Mathematical theories are mentally reproducible objects – you can understand them if you have the required background and think hard enough. From this perspective, mathematics is no different from art such as Van Gogh’s paintings – you can appreciate its beauty because you can mentally reconstruct the mathematical or artistic sensation from the objects presented to you. This is the simple reason why the Gaussian distribution or Da Vinci’s Mona Lisa can become cultural heritage and be passed on from generation to generation (a “meme”, in Richard Dawkins’s words in his book The Selfish Gene).

Engineering deals with experimentally reproducible objects – the making of a car, a smartphone, a telescope, etc. According to Wikipedia, the ultimate goal of engineering is to “safely realize improvements to the lives of people”. Therefore, it is not surprising that engineering inventions often have much shorter life expectancies than scientific theories or artistic products. We saw cassettes replaced by CDs, film cameras replaced by digital ones, dial-up modems replaced by high-speed internet connections. It is Darwin’s evolutionary law applied to the technological world. If your hero is someone like Bill Gates or Steve Jobs, their greatness simply lies in their vision in creating the right products. Experimental reproducibility is a necessary (but not sufficient) condition for a product’s commercial viability.

What is good engineering? There are various aspects – sometimes being the first is important (think of the invention of the telephone by Bell – that is why patents and intellectual property are valued by industry); sometimes you do not need to lead the race (think of the invention of the iPhone – as long as it is a good design, you can still catch up). I emphasize the importance of experimental reproducibility to good engineering because it is still a sad fact that research reproducibility has not become the standard norm in all technical communities (please refer to the supplementary reading I have posted to the course website). So it has become difficult (especially for young minds entering the field) to tell real progress from bogus progress among the bulk of papers published every year.

Recently I have been giving some thought to the concept of memory. I started from Rose’s award-winning book The Making of Memory: From Molecules to Mind and spent quite some time understanding the phenomenon of hysteresis. Then I came across Smolin’s entertaining book The Life of the Cosmos, and as I read it through, one idea stuck with me: “to understand a quark or an electron, we may have to know something about the history or the organization of the universe.”

If the above hypothesis is true, the environment influencing elementary particles will not be bounded by stars or galaxies but will extend to objects at a much larger spatial scale. The underlying reason is that the universe of today is the evolutionary result of interactions among elementary particles over billions of years. The subtle relationship between space and time was not recognized by humans until the last century. It is very likely that our understanding of nature will experience even more dramatic changes in the future – maybe over centuries, maybe millennia.

Toward this improved understanding, one fundamental question seems to be: does the universe have memory? Here by memory I mean that the physical laws and principles of today are the same as those of a long, long time ago. Of course, such a question cannot be answered scientifically at present, only philosophically. But it is at least aesthetically appealing to believe that the universe has followed one or a few universal principles. If that were not the case, how would the universe or God decide which principle to use at a given time, and what would make him change his mind tomorrow?

1. Forgetting to convert to double. I mentioned that if you use ‘imread’ to read in an image, it will be in uint8 format. This format is less reliable for arithmetic than double because it does not support floating-point operations (uint8 arithmetic saturates at 0 and 255). It is a good idea to convert images to double format before calculating any quantity such as MSE.

2. Pitfall with imwrite. Sometimes imwrite(x,…) does not produce the correct result; MATLAB expects a double image to take values in [0,1], so if x is a double array in the range [0,255], you need imwrite(x/255,…) instead.

3. imrotate: there are two directions – clockwise (negative angle) and anti-clockwise (positive angle).

4. hough: “[H, THETA, RHO] = HOUGH(BW) computes the SHT of the binary image BW. THETA (in degrees) and RHO are the arrays of rho and theta values over which the Hough transform matrix, H, was generated.” If you read this help information carefully, you will see that THETA and RHO only define the axes of the transform domain – they are not the detected peaks themselves.

5. Noise power calculation: given a clean image x and a noisy image y (assuming impulse noise), the noise power is the percentage of pixels where x~=y (not x==y).
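The first and last pitfalls above can be illustrated in a few lines. The course uses MATLAB; the sketch below is a pure-Python analogue (plain lists stand in for image pixel vectors, and sub_uint8 mimics MATLAB’s saturating uint8 arithmetic) showing why MSE must be computed in double, and how impulse-noise power is just the fraction of mismatched pixels.

```python
# MATLAB-style saturating uint8 subtraction: results clamp to [0, 255].
def sub_uint8(a, b):
    return max(min(a - b, 255), 0)

def mse_uint8(x, y):
    # Pitfall 1: in uint8, negative differences saturate to 0 and vanish.
    return sum(sub_uint8(a, b) ** 2 for a, b in zip(x, y)) / len(x)

def mse_double(x, y):
    # Correct: convert to floating point first, as double would.
    return sum((float(a) - float(b)) ** 2 for a, b in zip(x, y)) / len(x)

def noise_power(x, y):
    # Pitfall 5: impulse-noise power = fraction of pixels where x ~= y.
    return sum(1 for a, b in zip(x, y) if a != b) / len(x)

x = [10, 200, 128, 50]   # "clean" pixels
y = [12, 198, 128, 55]   # "noisy" pixels (3 of 4 differ)
print(mse_uint8(x, y))   # 1.0  -- badly underestimated
print(mse_double(x, y))  # 8.25 -- the true MSE
print(noise_power(x, y)) # 0.75
```

Note how the saturated version silently discards every pixel where y exceeds x, which is exactly the kind of bug that looks like a “better” MSE.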

According to Wikipedia, “In physics, the principle of relativity is the requirement that the equations describing the laws of physics have the same form in all admissible frames of reference.” The importance of the reference frame has been less appreciated in other sciences, and the purpose of this blog is to understand its relevance to visual perception.

Why do we need a frame of reference? A common example is motion-related illusion – sometimes one feels that a still train or airplane is moving because motion elsewhere in the scene is misattributed to the observer’s own motion. The underlying reason for such an illusion appears to be a lack of coordination between the vestibular system and the visual system. It also shows the relativity of motion perception – in the above illusion, the ambiguity is often easily resolved if the observer looks somewhere else (a change of reference frame).

The story does not end here. Pioneering studies by Nobel laureates Hubel and Wiesel in the 1950s showed the abundance of movement-sensitive cells in the visual cortex of the cat. It is easy to understand their role in the dorsal pathway from the perspective of motion detection for survival, but what about their role in the ventral pathway – how do movement-sensitive cells analyze a stationary landscape? The answer involves both saccades and microsaccades, which really echoes J. Gibson’s saying “We move because we see; we see because we move”. A more subtle implication lies in the scale of movement in visual perception – if saccades (global motion) serve to accumulate local information into a holistic understanding, microsaccades (local motion) are more relevant to the functioning of the ventral pathway involved in object recognition.

In other words, our eyes seldom process “absolutely stationary” images (the cells simply won’t fire); even the perception of a stationary scene is the consequence of moving our eyes around (both globally and locally). Therefore, it is misguided to look for the biological counterpart of still-image processing, because there is none. A biologically inspired approach to image processing is to cast images as a subspace of video and to understand color and texture along with motion and disparity. The principle of relativity implies that both saccades and microsaccades are important to the functioning of movement-sensitive cells, because those cells sense “relative” changes all the time.

This blog is a continuation of my previous blog on the relationship between mathematics and image processing, and it aims more at technical virtuosity than at conceptual understanding.

What are the most influential works by mathematicians on image processing research in the past three decades? I would say MRF in the 1980s; wavelets and PDEs in the 1990s; still too early to tell for the 2000s. Before Geman and Geman’s PAMI paper in 1984, image processing was still an art with little scientific insight. It was the Gemans’ paper that showed image processing can also be tackled with tools from statistical mechanics. Even though the analogy between the pixels of an image and the gas particles of an Ising model is artificial, this work has had a long-lasting impact. It should be noted that the mathematics in this paper is not entirely new (several results, such as Gibbs sampling, had been established by other researchers before), but it is the first successful application of statistical physics to image processing. From a historical perspective, it is not surprising to see that the Ising model, which has had a dramatic impact on modern physics, is also applicable to other scientific fields (MRFs, Hopfield networks and Boltzmann machines are all related to the Ising model).

If one had predicted the future of image processing in the 1980s based on the history of theoretical physics before the 1980s, one might have said: the renormalization group should be a cool idea for non-physicists to explore, because the decade of the 1970s belonged to the renormalization group (RG) – a mathematical apparatus that allows one to investigate the changes of a physical system as one views it at different distance scales. Indeed, the decade of the 1990s belonged to multiscale modeling of images: wavelet-based and PDE-based approaches both address the issue of scale, though from different perspectives. Wavelets are local schemes whose effectiveness lies in the good localization property of wavelet bases; PDE-based models are global schemes, often characterized by the minimization of some energy functional (note that they do admit local implementations based on diffusion). It is safe to say that both wavelet-based and PDE-based models have had a high impact on image processing, and their underlying connection has been established in certain specific cases. The unsettling issue is the varying role of locality – in physics, it is a fundamental assumption that interactions are local; but such an assumption does not hold for images, or more precisely for the visual perception of image signals, because of the nonlocal interactions among neurons.
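To make the diffusion remark concrete, here is a minimal sketch (in Python rather than the MATLAB used elsewhere in this course) of the simplest such local implementation: linear heat diffusion, discretized with the standard 5-point finite-difference Laplacian and explicit time stepping. Real PDE models for images (e.g., Perona–Malik) use nonlinear, edge-stopping diffusivities; this linear special case only illustrates the mechanics.

```python
def diffuse(u, dt=0.2, steps=10):
    """Linear (heat-equation) diffusion of a grayscale image given as a
    list of lists of floats, using the 5-point finite-difference Laplacian
    with reflecting (Neumann) boundaries.  dt <= 0.25 keeps the explicit
    scheme stable; each update is a convex combination of neighbors."""
    h, w = len(u), len(u[0])
    for _ in range(steps):
        v = [row[:] for row in u]
        for i in range(h):
            for j in range(w):
                up    = u[max(i - 1, 0)][j]
                down  = u[min(i + 1, h - 1)][j]
                left  = u[i][max(j - 1, 0)]
                right = u[i][min(j + 1, w - 1)]
                lap = up + down + left + right - 4 * u[i][j]
                v[i][j] = u[i][j] + dt * lap
        u = v
    return u

# A dark/bright edge with one impulse (the 50) on the dark side;
# diffusion smooths the impulse but also blurs the edge.
img = [[0, 0, 100, 100],
       [0, 50, 100, 100],
       [0, 0, 100, 100]]
out = diffuse([list(map(float, r)) for r in img], steps=5)
```

Because Neumann boundaries make the scheme conservative, the image mean is preserved exactly while the dynamic range shrinks – the global behavior that the energy-minimization view predicts from a purely local update rule.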

Since 1998, nonlocal image processing has been studied under many guises – e.g., bilateral filtering, texture synthesis via nonparametric sampling, nonlocal-means denoising (BM3D denoising is a more sophisticated version) and nonlocal TV. All of a sudden, lots of experimental findings seem to suggest something beyond the scope of wavelets and PDEs. What is it? I have been baffled by this question for many years and I am still groping for answers – one promising direction is to view images as fixed points of some dynamical system (an abstraction of neural systems) characterized by similitude and dissimilitude (an abstraction of excitatory and inhibitory neurons). The history of theoretical physics cannot help here, because as the complexity of a system increases, physics becomes chemistry and chemistry evolves into biology. The new breakthrough, if I can make it, will not come from mathematical virtuosity (I am simply not even close to Daubechies or Mumford) but from physical intuition. I think there exists a universal representation of the physical world and a universal cortical algorithm for neurons to encode sensory stimuli. From this perspective, image processing is really paving one possible path toward understanding fundamental phenomena such as biological memory and its implications for intelligence. Mathematics will surely still play a role in communicating my findings to others, but hopefully I might only need math skills at the level of Shannon or Ashby for this task.
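For readers who have not met it, the idea behind nonlocal means is easy to sketch: every sample is replaced by a weighted average over all samples, with weights determined by patch similarity rather than spatial distance. The toy 1-D Python version below is my own illustrative code, not the original algorithm of Buades et al. (which operates on 2-D patches within a limited search window); the patch size and filtering parameter h are arbitrary choices for the demo.

```python
import math

def nl_means_1d(x, patch=1, h=10.0):
    """Toy nonlocal means on a 1-D signal: each sample becomes a weighted
    average of ALL samples, weighted by the similarity of their
    surrounding patches (clamped at the boundaries)."""
    n = len(x)
    def patch_at(i):
        return [x[min(max(i + k, 0), n - 1)] for k in range(-patch, patch + 1)]
    out = []
    for i in range(n):
        pi = patch_at(i)
        wsum, acc = 0.0, 0.0
        for j in range(n):
            pj = patch_at(j)
            # Mean squared patch difference drives the (Gaussian) weight.
            d2 = sum((a - b) ** 2 for a, b in zip(pi, pj)) / len(pi)
            w = math.exp(-d2 / (h * h))
            wsum += w
            acc += w * x[j]
        out.append(acc / wsum)
    return out

# A noisy step edge: repeated structure lets distant samples help
# each other, while the dissimilar patches across the edge do not mix.
signal = [0, 2, 0, 1, 0, 100, 98, 100, 99, 100]
den = nl_means_1d(signal)
```

The point of the demo is precisely the nonlocality: sample 0 is denoised by samples 2 and 4, which are spatially distant but patch-wise similar, while the edge at position 5 stays sharp – behavior that neither a local wavelet shrinkage nor a local diffusion step reproduces.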

Last, I happened to learn that the first Millennium Prize of the Clay Institute was awarded to Perelman for his resolution of the Poincaré conjecture. There is a very well-written report about this famous conjecture which I think even non-mathematicians like myself will enjoy reading. My instinct tells me someone might have already applied this fancy new tool of Ricci flow to image processing. Indeed, several researchers have pursued this direction, but from their preliminary findings, I think they won’t go very far unless they can first supply the Ricci flow with some physical intuition. It has been suggested that “an issues-directed approach is more fruitful than a method-directed approach to science”, which really echoes the point of this blog.