Synechocystis sp. PCC 6803 CELL IMAGES Committee …cbs/projects/2004_report_ko... ·  ·...


FEATURE EXTRACTION FROM Synechocystis sp. PCC 6803 CELL IMAGES

by

Shylaja Kokoori

Committee Members

Dr. Robert Roberson
Associate Professor, School of Life Sciences
Arizona State University
Tempe, AZ 85287-1804

Dr. Rosemary Renaut
Director, Computational Biosciences Program
Arizona State University
Tempe, AZ 85287-1804

Dr. Kenneth Hoober
Professor, School of Life Sciences
Arizona State University
Tempe, AZ 85287-1804

ARIZONA STATE UNIVERSITY

August 2004


ABSTRACT

Synechocystis sp. PCC 6803 is one of the most popular organisms for genetic and physiological studies of photosynthesis because it can grow heterotrophically at the expense of glucose and because its entire genome sequence is available. Dr. Roberson’s laboratory in the School of Life Sciences is conducting electron microscopy studies using three-dimensional (3D) reconstruction of electron tomographic data of wild-type Synechocystis cells and selected mutants. Their goal is to identify mechanisms of photosynthesis and thylakoid membrane biogenesis in this cyanobacterium. Accurate extraction of important inclusions from electron tomographic data, by segmenting images and performing 3D analysis and modeling, will help in achieving these goals. Currently, however, segmentation in electron tomography is almost exclusively a manual operation. As a result, it is often the most time-consuming and subjective step in the data analysis process. This project is an attempt to automatically segment features such as ribosomes, thylakoid membranes and cytoplasmic filaments from tomographic data of Synechocystis cells. Morphological operations and mathematical methods are used in order to provide biologists with information in a timely manner.


TABLE OF CONTENTS

LIST OF FIGURES

Chapter 1: GOALS OF PROJECT

Chapter 2: ABOUT Synechocystis

Chapter 3: INTRODUCTION AND OVERVIEW

Chapter 4: ELECTRON TOMOGRAPHY

Chapter 5: DATA PREPARATION

Chapter 6: DATA

Chapter 7: METHODS

7.1 First Order Differential Method
7.2 Convolution
7.3 Gaussian Smoothing
7.4 Seeded Region Growing
7.5 Watershed Segmentation
7.6 Eigen Value Analysis of Hessian Matrix

Chapter 8: ALGORITHM

Chapter 9: TOOL DESCRIPTION

Chapter 10: IMPLEMENTATION

Chapter 11: RESULTS

Chapter 12: CONCLUSIONS AND FUTURE WORK

REFERENCES


LIST OF FIGURES

Figure 1: Electron tomographic slice through the Synechocystis cell

Figure 2: Histogram thresholding. (a) Original image (b) thresholded image, t = 125 (using Vigra) and (c) histogram bar plot of original image (using Matlab)

Figure 3: Edge detection (a) original image (b) after applying Canny edge detector, σ = 3, gradient threshold = 3 (using Vigra)

Figure 4: High magnification showing ribosomes

Figure 5: High magnification showing thylakoid membranes

Figure 6: High magnification showing filaments

Figure 7: 1st and 2nd derivative of an edge illustrated in one dimension

Figure 8: Synechocystis cell image after Gaussian smoothing operation, σ = 3

Figure 9: Synechocystis cell image gradient after Gaussian smoothing operation, σ = 3

Figure 10a: Example image and kernel to illustrate convolution

Figure 10b: Image overlaid with kernel

Figure 11: Watershed segmentation. (a) Gray level gradient of image data (b) Displays local minima and watershed region in the image data

Figure 12: Watershed segmentation on Synechocystis cell. (a) Original image (b) edge image after performing watershed transformation (c) after region growing (enlarged image)

Figure 13: Labeled region after watershed transformation

Figure 14: Thresholded image after watershed segmentation

Figure 15: Extracted thylakoid membranes

Figure 16: Tool user interface (a) Displays dialog box to set parameters (b) Window displaying opened image file

Figure 17: (a) Original Synechocystis cell image (b) extracted ribosomes

Figure 18a: Manually-segmented model of a zoomed-in portion of the image

Figure 18b: Ribosomes detected using the tool overlaid on the real image

Figure 19: (a) Original Synechocystis cell image and (b) extracted thylakoid membranes overlaid on the original image

Figure 20: (a) Original Synechocystis cell image and (b) extracted filaments overlaid on the original image


GOAL OF PROJECT

The goal of this project is to help understand the basic cell biology of photosynthesis in the unicellular cyanobacterium Synechocystis sp. PCC 6803 by providing algorithms that enable automatic extraction of inclusions such as ribosomes, thylakoid membranes and cytoplasmic filaments from tomograms. Currently, segmentation is the bottleneck in electron tomographic data analysis because it is almost entirely a manual operation. This project is an attempt to design and develop a tool that automatically segments inclusions in the cell by making use of morphological operations and mathematical methods. Extracting features from the cell in an accurate and timely manner for 3D analysis and modeling will aid in understanding the basic cell biology of this organism and, specifically, the mechanisms of photosynthesis and thylakoid membrane biogenesis. In addition, the tool has to provide a friendly user interface that helps the user work with it efficiently. The overall aim is to develop software that is reliable and expandable, so that new features can be added easily in the future.


ABOUT Synechocystis :

Cyanobacteria are photoautotrophic organisms that are capable of performing oxygen-producing photosynthesis similar to that of plants [18]. Synechocystis sp. PCC 6803 is one such unicellular, non-nitrogen-fixing cyanobacterium, which inhabits fresh water [19]. This cyanobacterium is capable of growing heterotrophically at the expense of glucose, which has made it a desirable model organism for genetic and physiological studies of photosynthesis. The organism displays a unique combination of highly desirable molecular-genetic, physiological, and morphological characteristics: it is spontaneously transformable, incorporates foreign DNA into its genome by double-homologous recombination (making gene knock-outs and replacements clear-cut), can grow under many different physiological conditions (photoauto-, mixo- or heterotrophically), is small (~1.5 µm in diameter) and is suitable for quantitative 3D ultrastructural analysis. This, coupled with the fact that cyanobacteria are closely related to the ancestors of chloroplasts, makes Synechocystis an ideal experimental system [16]. Fig. 1 displays a tomographic slice of a Synechocystis cell containing inclusions such as ribosomes, thylakoid membranes and cytoplasmic filaments.


Figure 1: Electron Tomographic slice through the Synechocystis cell. Scale bar = 100 nm

INTRODUCTION AND OVERVIEW:

Image segmentation is an important step in image analysis. It is, however, a difficult step because it depends largely on the data and the application. For a biomedical application, segmentation may involve identifying the shape and size of a tumor. For a geographic application, the task may include identifying the location of roads in an area or the location of weeds in a lake. For tomographic data, segmentation might involve extracting inclusions for further analysis. The ultimate goal, generally, is to reduce the given image to non-intersecting regions of interest [3].

The most common methods used to segment images are (a) histogram thresholding, (b) edge-based methods and (c) region growing methods [4].

HISTOGRAM THRESHOLDING:

This is one of the easiest methods of segmentation and is the ideal method for segmenting objects from a distinct background. Different types of adaptive thresholding methods are available, which work based on maxima or minima of the histogram or make use of other histogram-related functions. Since thresholding does not consider any other factor, such as shape or position, it is suitable only for simple images [4]. Fig. 2a, Fig. 2b and Fig. 2c show the original image, the image thresholded at a grayscale value of 125, and a histogram bar plot of the original image, respectively. Thresholding could be used to extract the ribosomes, but the result depends largely on the threshold value selected. A high threshold value eliminates some of the ribosomes from the image, and a low threshold value extracts noise and unwanted structures along with them. It is thus not an appropriate choice for this project. Moreover, it does not help in detecting thylakoid membranes, because they are difficult to distinguish from the background.
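To make the sensitivity to the threshold value concrete, the following is a minimal sketch of global thresholding in plain Python. It is not the Vigra code used for Fig. 2; the tiny test image, the function names and the choice to treat pixels darker than t as objects are assumptions made only for illustration.

```python
def histogram(image, levels=256):
    """Count how many pixels take each gray level (0 .. levels-1)."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def threshold(image, t=125):
    """Binary mask: 1 where the pixel is darker than t (assumed object), else 0."""
    return [[1 if value < t else 0 for value in row] for row in image]


# Hypothetical 4x4 gray-level image: bright background, a few dark pixels.
image = [
    [200, 210, 190, 205],
    [198,  40,  60, 201],
    [202,  55, 130, 199],
    [207, 203, 120, 204],
]

mask = threshold(image, t=125)
for row in mask:
    print(row)
```

In this toy image the pixels with values 120 and 130 fall on opposite sides of t = 125 although they differ very little, which is the kind of behaviour that makes a single global threshold fragile on low-contrast data.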

Figure 2: Histogram thresholding. (a) Original image. Scale bar = 100 nm (b) Thresholded image, t = 125 (using Vigra) and (c) Histogram bar plot of original image (using Matlab)

EDGE BASED SEGMENTATION METHODS

Edges in an image are detected by making use of a combination of methods such as convolution matrix-based operators and the Hough transform. The Convolution section under Methods explains the convolution operation in further detail. The Hough transform is a method used to detect features such as lines, curves and ellipses, where the desired feature can be expressed in a parametric form. The main advantage of using the Hough transform is that it is relatively unaffected by gaps in the feature being identified or by image noise [4]. Edge detection thus helps to identify the boundaries of the primary elements in an image. However, the presence of noise in an image may lead to over-fragmentation and to some pale edges not being detected [4]. A classic example of an edge detector is the Canny edge detector. The steps in this algorithm include

• convolving the image with a Gaussian of scale σ,

• computing the x and y gradients,

• finding peaks in the image gradient,

• performing a threshold operation to remove unwanted responses.

Fig. 3a and Fig. 3b show the original image and the edge image obtained after applying Canny edge detection to it, respectively.

Figure 3: Edge detection (a) Original image. Scale bar = 100 nm (b) After applying Canny edge detector, σ = 3, gradient threshold = 3 (using Vigra)

REGION GROWING


This is a very popular segmentation method, and it detects edges well in noisy images. Initially the image is split into a large number of small regions, which are merged recursively into larger regions based on criteria such as homogeneity or nearest-neighbor merging [6]. In this project, watershed segmentation, a special case of seeded region growing, has been used to extract ribosomes from the images. This method is explained in detail in the Watershed Segmentation section under Methods.

In many low-contrast, noisy images, one or more of these methods may need to be applied to obtain good results. Also, for noisy images, performing preprocessing operations such as image smoothing before feature extraction reduces the noise present in the image, leading to better results.

ELECTRON TOMOGRAPHY

Electron tomography is a method that helps determine the 3D structure of cells and tissues at high resolution. A series of images of an object is taken at various tilt angles, and the images are combined using the back-projection algorithm to produce a 3D structure of the object [17]. A very useful feature of electron tomography is that it allows the structure to be viewed in situ, which aids the study of its structural arrangement. Electron tomography is an important tool for studying macromolecules and cellular complexes [11].

One of the main issues in biological electron tomography is that the images obtained have a low signal-to-noise ratio, mainly due to the complexity of the biological specimen. For example, a cell can contain numerous organelles of different shapes and sizes. Another issue arises from the fact that electrons can damage the biological specimen. The electron dose is therefore chosen so that it provides adequate contrast for studying the specimen without damaging its structure [2]. Improvements in instrumentation and better techniques for improving data quality have led to the production of vast amounts of electron tomographic data, which are very challenging to mine. Therefore, developing tools that can obtain information with minimal human intervention is becoming essential.

DATA PREPARATION:

Serial thick sections (100-300 nm) of the Synechocystis cell are cut and post-stained using uranyl acetate and Reynolds' lead citrate. Colloidal gold particles are attached to the surface of the section, which is then viewed using an electron microscope. Tomograms are produced by taking images every 1.5° over a range of tilt angles from −60° to +60°. IMOD software, developed at the University of Colorado, Boulder, tracks the positions of the gold particles on the section to align and merge the images into a 3D model [18].
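As a quick check of what this acquisition scheme implies, the short Python sketch below lists the tilt angles. It assumes that both end points (−60° and +60°) are imaged, which the text does not state explicitly; the function name is hypothetical.

```python
def tilt_angles(start=-60.0, stop=60.0, step=1.5):
    """Tilt angles in degrees, assuming both end points are included."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


angles = tilt_angles()
print(len(angles), angles[0], angles[-1])
```

Under that assumption a single tilt series consists of 81 projection images.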

DATA:


IMOD is used to split the 3D model into a stack of two-dimensional (2D) TIFF images, which serve as the input data for the program. Each 2D image is an RGB image of 768 × 768 pixels with a size of 1.69 MB. The images contain inclusions such as ribosomes, thylakoid membranes and cytoplasmic filaments, along with various other cell organelles. The task here is to extract the ribosomes, thylakoid membranes and filaments.
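The quoted file size is consistent with uncompressed 8-bit RGB data, as the following small calculation suggests. The 8 bits per channel, the absence of compression and the use of binary megabytes are assumptions; TIFF header overhead is ignored.

```python
width, height, channels = 768, 768, 3   # RGB image, assumed 1 byte per channel

raw_bytes = width * height * channels   # uncompressed pixel data
mebibytes = raw_bytes / (1024 * 1024)   # size in units of 2**20 bytes

print(raw_bytes, mebibytes)
```

The result, 1.6875, rounds to the 1.69 MB stated above.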

Ribosomes are the prominent black structures that appear circular in the image. Since they are inherently 3D structures, their diameter in a 2D image varies depending on which slice of the section is being viewed. On average, however, they have a diameter of approximately 8-10 pixels (Fig. 4).

Thylakoid membranes are curvilinear structures located near the cell wall, with a width of approximately 12-15 pixels (Fig. 5).

Filaments are also curvilinear structures; they, however, appear throughout most of the cell. They are of very small diameter and form a mesh structure in the cell (Fig. 6).


Figure 4: High magnification showing ribosomes. Scale bar = 50 nm

Figure 5: High magnification showing thylakoid membranes. Scale bar = 100 nm


Figure 6: High magnification showing filaments. Scale bar = 50 nm


METHODS:

A major task in the feature extraction process is edge detection. Most edge detection methods work on the assumption that edges occur where there is an intensity gradient in the image. The strength of an edge depends on the rate at which the intensity changes: the faster the change in intensity, the stronger the edge. Therefore, one way to detect edges is to identify step discontinuities in the image [7]. Identifying the discontinuities by finding local maxima or minima of the first derivative, or by finding zero crossings of the second derivative of the image, are the most commonly used methods for edge detection (Fig. 7). Enhancement and smoothing operations applied to the image can make the edges more noticeable.

[Figure panels: function representing an edge; 1st derivative; 2nd derivative, with the zero crossing marked]

Figure 7: 1st and 2nd derivative of an edge illustrated in one dimension (http://ari.cankaya.edu.tr/~reza/ImLab4.htm)


First order differential method:

In a discrete image the gradient can be calculated by taking the difference

in the gray scale values of adjacent pixels [7].

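The gradient of a two-dimensional image $I(x,y)$ is given by the vector

$$\nabla I = \left[\frac{\partial I}{\partial x},\ \frac{\partial I}{\partial y}\right],$$

which has magnitude

$$|\nabla I| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$

and direction

$$\theta = \tan^{-1}\left[\frac{\partial I/\partial y}{\partial I/\partial x}\right].$$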
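The discrete form of these expressions can be sketched in a few lines of Python. The version below uses central differences between neighbouring pixels and leaves the border pixels at zero; both choices, and the synthetic step-edge image, are assumptions for illustration rather than the scheme used in the tool.

```python
import math


def gradient(image):
    """Gradient magnitude and direction (radians) by central differences."""
    h, w = len(image), len(image[0])
    magnitude = [[0.0] * w for _ in range(h)]
    direction = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # dI/dx
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # dI/dy
            magnitude[y][x] = math.hypot(gx, gy)
            direction[y][x] = math.atan2(gy, gx)
    return magnitude, direction


# Synthetic vertical step edge: dark on the left, bright on the right.
image = [[0, 0, 10, 10] for _ in range(4)]
magnitude, direction = gradient(image)
print(magnitude[1], direction[1])
```

For this step edge the gradient is nonzero only next to the intensity jump and points along the x axis (direction 0), that is, across the edge.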

One of the main issues associated with this method of edge detection is the presence of noise. For most noise models, large derivatives due to noise are local events, whereas large derivatives of the signal are present over a larger area. This property can be used to reduce the noise: the problem can be overcome by applying an image smoothing algorithm.

A two-step approach has been adopted for the images in this project:

• Convolve the image with a Gaussian mask,

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\left[-\frac{x^2+y^2}{2\sigma^2}\right],$$

to smooth it.

• Calculate the derivative of the smoothed image.

Smoothing an image and then differentiating it is the same as convolving it with the derivative of the smoothing kernel [7].
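This equivalence follows from the associativity of convolution, and it can be checked numerically. The one-dimensional Python sketch below is only an illustration: the sampled and truncated Gaussian, the central-difference kernel and the test signal are assumptions, not part of the original tool.

```python
import math


def convolve(f, g):
    """Full discrete 1D convolution of two sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def gaussian_kernel(sigma, radius):
    """Sampled 1D Gaussian, normalised so that the weights sum to 1."""
    k = [math.exp(-(x * x) / (2.0 * sigma * sigma))
         for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


signal = [0.0] * 8 + [1.0] * 8   # an ideal step edge
g = gaussian_kernel(sigma=1.0, radius=3)
d = [0.5, 0.0, -0.5]             # central difference written as a convolution kernel

two_step = convolve(convolve(signal, g), d)   # smooth, then differentiate
one_step = convolve(signal, convolve(g, d))   # convolve with the derivative of the kernel

max_diff = max(abs(a - b) for a, b in zip(two_step, one_step))
print(max_diff)
```

The two results agree up to floating-point rounding, so the derivative kernel can be computed once and reused for every image.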


Figure 8: Synechocystis cell image after

Gaussian smoothing operation, σ =3

Figure 9: Synechocystis cell image gradient after

Gaussian smoothing operation, σ =3

Convolution:

Convolution is useful for a large number of image processing operations, such as image smoothing and image enhancement. It is a multi-pixel operation in which each output pixel is computed from the values of a set of adjoining input pixels, using a mask:

Digital image:

I11 I12 I13 I14 I15 I16 I17 I18 I19
I21 I22 I23 I24 I25 I26 I27 I28 I29
I31 I32 I33 I34 I35 I36 I37 I38 I39
I41 I42 I43 I44 I45 I46 I47 I48 I49
I51 I52 I53 I54 I55 I56 I57 I58 I59
I61 I62 I63 I64 I65 I66 I67 I68 I69

Kernel mask:

K11 K12 K13
K21 K22 K23
K31 K32 K33

Figure 10a: Example image and kernel to illustrate convolution

Figure 10b: Image overlaid with kernel. The kernel entries K11 to K33 are placed over the image pixels I46 to I68.

The output pixel value is a linear combination of certain input pixel values. For the kernel position shown in Fig. 10b, the value of the corresponding pixel in the output image is given by:

$$O_{57} = I_{46}K_{11} + I_{47}K_{12} + I_{48}K_{13} + I_{56}K_{21} + I_{57}K_{22} + I_{58}K_{23} + I_{66}K_{31} + I_{67}K_{32} + I_{68}K_{33}.$$

Convolution is basically a process where each pixel is replaced with a

weighted average of values of the neighboring pixels as defined by the

mask [8]. Convolving with a Gaussian mask can help filter out the noise.
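The expression for O57 above can be reproduced with a short Python sketch. The code applies a 3 × 3 mask exactly as that sum is written, that is, without flipping the kernel; strictly speaking, convolution flips the kernel first, but for symmetric masks such as the Gaussian the two give the same result. The numeric encoding of the pixels (pixel Irc holds the number rc), the averaging mask and the zero-valued border are assumptions for illustration.

```python
def apply_mask(image, kernel):
    """Weighted sum of each 3x3 neighbourhood; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    acc += image[r - 1 + i][c - 1 + j] * kernel[i][j]
            out[r][c] = acc
    return out


# 6x9 image in which pixel I_rc (1-based row r, column c) holds the number "rc".
image = [[10 * r + c for c in range(1, 10)] for r in range(1, 7)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]   # simple averaging mask

out = apply_mask(image, box)
print(out[4][6])   # O57: row 5, column 7 in the 1-based notation of the text
```

With the averaging mask, O57 is the mean of I46 to I68, which for this synthetic image is 57 (up to rounding).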

Gaussian Smoothing:

Smoothing using a Gaussian kernel is convenient because of the following properties:

- Most optimal filters have a Gaussian-like profile, because a smoothing filter must place a stronger weight on the pixels at the center of the filter and a lesser weight on those that are distant.

- Convolving a Gaussian with a Gaussian results in a Gaussian:

$$G_{\sigma_1} * G_{\sigma_2} = G_{\sqrt{\sigma_1^2 + \sigma_2^2}}$$

Because of this, re-smoothing a smoothed image simply yields a smoother image, equivalent to a single smoothing at the larger scale.

- The Gaussian kernel is separable:

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\exp\left[-\frac{x^2+y^2}{2\sigma^2}\right] = \left[\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{x^2}{2\sigma^2}\right)\right]\cdot\left[\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{y^2}{2\sigma^2}\right)\right],$$

that is, a product of two 1D Gaussians [10]. Thus convolving with a 2D Gaussian kernel is equivalent to convolving with two 1D kernels. This is important because separable kernels considerably reduce the computation cost. Fig. 8 displays the image that results from performing Gaussian smoothing on Fig. 1.
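The separability property, and the saving it brings, can be verified with a few lines of Python. In the sketch below the kernel radius of 3σ is an assumption chosen for illustration; the report does not state the kernel size used.

```python
import math


def gauss1d(x, sigma):
    """1D Gaussian with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def gauss2d(x, y, sigma):
    """Isotropic 2D Gaussian with standard deviation sigma."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)


sigma = 3.0
radius = 9                      # assumed truncation at 3 * sigma
coords = range(-radius, radius + 1)

# Largest difference between the 2D kernel and the product of two 1D kernels.
max_err = max(abs(gauss2d(x, y, sigma) - gauss1d(x, sigma) * gauss1d(y, sigma))
              for x in coords for y in coords)

size = 2 * radius + 1
mults_2d = size * size          # multiplications per pixel with the full 2D kernel
mults_separable = 2 * size      # one horizontal pass plus one vertical pass

print(max_err, mults_2d, mults_separable)
```

For this kernel size the separable form needs 38 multiplications per pixel instead of 361.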

Seeded Region Growing:


The step prior to seeded region growing segments the image into uniquely labeled seed pixels and unlabeled pixels. The unlabeled pixels are then assigned to the most closely matching seeded regions based on some fitness criterion. Possible fitness functions are

- The local gradient, so that the regions meet at the local maxima of the gradient

- The difference between the gray level of the candidate pixel and the mean gray level of the region [1].
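A minimal Python sketch of seeded region growing with the second fitness function (difference from the region's mean gray level) is given below. It is an illustrative simplification, not the Vigra implementation used in the tool: candidate priorities are computed when a pixel is queued and are not refreshed as the region mean changes, and the 4-neighbourhood, the function name and the test image are assumptions.

```python
import heapq


def seeded_region_growing(image, seeds):
    """Grow labeled regions from seeds; seeds maps (row, col) -> label (> 0)."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    total, count = {}, {}
    heap = []

    def push_neighbours(r, c, lab):
        mean = total[lab] / count[lab]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] == 0:
                # Fitness: distance of the candidate from the region's mean gray level.
                heapq.heappush(heap, (abs(image[rr][cc] - mean), rr, cc, lab))

    for (r, c), lab in seeds.items():
        labels[r][c] = lab
        total[lab] = total.get(lab, 0) + image[r][c]
        count[lab] = count.get(lab, 0) + 1
    for (r, c), lab in seeds.items():
        push_neighbours(r, c, lab)

    while heap:
        _, r, c, lab = heapq.heappop(heap)
        if labels[r][c]:
            continue                      # already claimed by a better match
        labels[r][c] = lab
        total[lab] += image[r][c]
        count[lab] += 1
        push_neighbours(r, c, lab)
    return labels


image = [
    [10, 12, 11, 90, 92],
    [11, 10, 13, 91, 93],
    [12, 11, 12, 89, 90],
]
labels = seeded_region_growing(image, {(1, 0): 1, (1, 4): 2})
for row in labels:
    print(row)
```

With one seed in the dark area and one in the bright area, the two regions meet where the gray level jumps.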

Watershed Segmentation:

This is a special case of seeded region growing in which two neighboring regions meet at local maxima. Here the pixels in the image are sorted by grayscale value (intensity). Local minima form the catchment basins. The process can be compared to flooding a landscape: water starts filling the basins and the water level rises. The points where the water from two catchment basins would meet form the boundary between the regions [1].

[Figure labels: boundary; regions; catchment basins (local minima)]

Figure 11: Watershed segmentation. (a) Gray level gradient of image data (b) Displays local minima and watershed region in the image data

One of the disadvantages of this method is that it can lead to over-segmentation. Fig. 11 helps in explaining this issue. Over-segmentation occurs because every local minimum, however shallow, forms a catchment basin. A solution is to ignore the catchment basins that are too shallow.
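The flooding idea and the effect of discarding shallow basins can be illustrated on a one-dimensional gray-level profile. The Python sketch below is a simplified illustration under stated assumptions: it works in 1D, assigns ridge pixels to one of the neighbouring basins instead of marking explicit watershed lines, and uses a hypothetical `min_depth` parameter and depth measure that are not taken from the report or from Vigra.

```python
import heapq


def local_minima(profile):
    """Indices whose value is strictly lower than both neighbours."""
    n, inf = len(profile), float("inf")
    return [i for i in range(n)
            if profile[i] < (profile[i - 1] if i > 0 else inf)
            and profile[i] < (profile[i + 1] if i < n - 1 else inf)]


def basin_depth(profile, i):
    """Rise in water level above minimum i before it spills into a lower basin."""
    def barrier(step):
        highest, j = profile[i], i + step
        while 0 <= j < len(profile):
            if profile[j] < profile[i]:
                return highest
            highest = max(highest, profile[j])
            j += step
        return float("inf")               # no lower basin on this side
    return min(barrier(-1), barrier(1)) - profile[i]


def watershed_1d(profile, min_depth=0.0):
    """Label each sample with the basin that floods it first."""
    labels = [0] * len(profile)
    heap = []
    kept = [i for i in local_minima(profile) if basin_depth(profile, i) >= min_depth]
    for lab, i in enumerate(kept, start=1):
        labels[i] = lab
        heapq.heappush(heap, (profile[i], i))
    while heap:                           # flood in order of increasing gray level
        _, i = heapq.heappop(heap)
        for j in (i - 1, i + 1):
            if 0 <= j < len(profile) and labels[j] == 0:
                labels[j] = labels[i]
                heapq.heappush(heap, (profile[j], j))
    return labels


profile = [5, 2, 4, 8, 3, 1, 3, 4, 3.5, 6]
print(watershed_1d(profile))                  # every minimum forms a basin
print(watershed_1d(profile, min_depth=1.0))   # shallow basins are ignored
```

In the first call the small dip at value 3.5 produces a third region; in the second call that basin is ignored and its samples are absorbed by the neighbouring region.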

Figure 12: Watershed segmentation on Synechocystis cell. (a) Original image (b) Edge image after performing watershed transformation (c) After region growing (enlarged image)

Eigen value analysis of Hessian matrix:

The Hessian matrix is a matrix built from the second order partial

derivatives of the image [3] and contains information about the image

curvature, which is useful in identifying shapes in an image.

For a 2D image it is given as

$$H = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{yx} & I_{yy} \end{bmatrix}, \quad \text{where } I_{xy} = \frac{\partial^2 I}{\partial x\,\partial y},\quad I_{xx} = \frac{\partial^2 I}{\partial x^2} \quad \text{and} \quad I_{yy} = \frac{\partial^2 I}{\partial y^2}.$$

Note that $I_{xy} = I_{yx}$.

For a 3D image it is given as

$$H = \begin{bmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{bmatrix}$$

Here $I$ represents a volume function $I(x, y, z)$ [3].

Convolving the image data with derivatives of a Gaussian kernel can be used to compute H, but this is a computationally intensive process. The complexity can be reduced by making use of separable Gaussian kernels (see Gaussian Smoothing) [10].

Given H for each pixel in the image, the eigenvalues and eigenvectors of each H can be calculated. Two eigenvalues, λ1 and λ2, are obtained for a 2D image, and three eigenvalues, λ1, λ2 and λ3, are obtained for a 3D image. The eigenvector corresponding to the largest eigenvalue represents the direction of the curvature. In addition, if both λ1 and λ2 are large, the pixel represents a corner.
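For the 2D case the eigenvalues and eigenvectors of the symmetric matrix H have a closed form, which the Python sketch below evaluates for a single pixel. The tool itself uses the Newmat library for linear algebra; this stand-alone function, its name, and the convention of ordering the eigenvalues by absolute value are assumptions made for illustration.

```python
import math


def hessian_eigen(ixx, ixy, iyy):
    """Eigenvalues (|l1| >= |l2|) of [[ixx, ixy], [ixy, iyy]] and the unit eigenvector of l1."""
    if ixy == 0.0:
        # Already diagonal: the eigenvectors are the coordinate axes.
        if abs(ixx) >= abs(iyy):
            return ixx, iyy, (1.0, 0.0)
        return iyy, ixx, (0.0, 1.0)
    mean = (ixx + iyy) / 2.0
    root = math.hypot((ixx - iyy) / 2.0, ixy)
    a, b = mean + root, mean - root
    l1, l2 = (a, b) if abs(a) >= abs(b) else (b, a)
    vx, vy = ixy, l1 - ixx               # solves (H - l1 * Id) v = 0
    norm = math.hypot(vx, vy)
    return l1, l2, (vx / norm, vy / norm)


# A dark horizontal line on a bright background: strong curvature across the line only.
print(hessian_eigen(0.0, 0.0, 4.0))
# A point with curvature in both directions.
print(hessian_eigen(2.0, 1.0, 2.0))
```

In the first example one eigenvalue dominates and its eigenvector points across the line; in the second both eigenvalues are of comparable size.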

The algorithms given in the next chapter make use of these methods to extract the inclusions: ribosomes, thylakoid membranes and filaments.


ALGORITHM:

Ribosome segmentation:

An approach based on the watershed transformation, a special case of the seeded region growing algorithm, has been used to segment the ribosomes, which are the most prominent structures in the image.

Steps:

• Smooth the image with a Gaussian kernel (σ = 3)

• Find the x and y components of the image gradient

• Transform the components to the gradient magnitude

• Find the local minima of the gradient magnitude

• Label the minima found

• Perform region growing using the minima as seed points

• Threshold the resulting image (t = 125)

Figure 13 : Labeled region after watershed transformation


Figure 14: Thresholded image after watershed segmentation (t=125)

Thylakoid membrane & filament segmentation:

In order to improve the contrast between the curvilinear structures (thylakoid membranes or filaments) and the rest of the image, local geometric properties of curvilinear structures have been exploited, based on eigenvalue and eigenvector analysis of the Hessian matrix.

Steps:

• Calculate the second-order spatial derivatives Ixx, Ixy, Iyx and Iyy of the image I(x,y), where Ixy = Iyx, by convolving the image with derivatives of a Gaussian.

• Determine the Hessian matrix, H, at each pixel.

• Find the eigenvalues and eigenvectors of H.

• The largest eigenvalue and its corresponding eigenvector indicate the strength and direction of the curve.

• Filter the pixels which meet the curvature property.
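The steps above can be sketched on a small synthetic image. The Python code below is a simplified illustration, not the tool's implementation: it replaces the Gaussian derivatives with plain finite differences, and the curvature test (one large positive eigenvalue and a much smaller second one, as expected for a dark line on a bright background) together with the `strength` and `ratio` parameters are assumptions.

```python
import math


def second_derivatives(img, r, c):
    """Finite-difference estimates of Ixx, Ixy and Iyy at an interior pixel."""
    ixx = img[r][c + 1] - 2 * img[r][c] + img[r][c - 1]
    iyy = img[r + 1][c] - 2 * img[r][c] + img[r - 1][c]
    ixy = (img[r + 1][c + 1] - img[r + 1][c - 1]
           - img[r - 1][c + 1] + img[r - 1][c - 1]) / 4.0
    return ixx, ixy, iyy


def eigenvalues(ixx, ixy, iyy):
    """Eigenvalues of the 2x2 Hessian, ordered so that |l1| >= |l2|."""
    mean = (ixx + iyy) / 2.0
    root = math.hypot((ixx - iyy) / 2.0, ixy)
    a, b = mean + root, mean - root
    return (a, b) if abs(a) >= abs(b) else (b, a)


def curvilinear_mask(img, strength=50.0, ratio=0.25):
    """Mark pixels with strong curvature in one direction and little in the other."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            l1, l2 = eigenvalues(*second_derivatives(img, r, c))
            if l1 > strength and abs(l2) <= ratio * l1:
                mask[r][c] = 1
    return mask


# Bright background with a dark horizontal line (row 3) and an isolated dark spot.
img = [[200] * 7 for _ in range(7)]
img[3] = [50] * 7
img[5][3] = 50

mask = curvilinear_mask(img)
for row in mask:
    print(row)
```

The line pixels are kept because their curvature is strong in one direction only, while the isolated spot, which has two equally large eigenvalues, is rejected.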


Figure 15: Extracted thylakoid membranes

TOOL DESCRIPTION:

The end product is an executable file which runs on a Windows system via Cygwin [19].

The following tools and libraries are required for the application to function on a Windows system:

- Cygwin, which provides a Linux-like environment for Windows. It also provides an X Window server, which is needed to open the application window [15].

- VIGRA computer vision library, version 1.2.0, implemented by Ullrich Köthe [12]

- Linear algebra library Newmat, version 11, implemented by Robert Davies [13]

The user interface for the tool was developed using Qt, a cross-platform development tool. The source code should therefore compile on UNIX systems too, but this has not been tested.

The user interface provides menu options to Open, Close and Save files. Another menu item is Segment, which provides options to segment ribosomes, thylakoid membranes and filaments. In addition, a dialog box is provided in which users can set parameters such as the threshold if they are not satisfied with the results obtained using the default values.

Figure 16a: Tool user interface - Displays dialog box to set parameters

Figure 16b: Tool user interface - Window displaying opened image file


The tool also provides tool tips and shortcut keys, which help the end user to use the menu options easily.


IMPLEMENTATION:

Three programming languages were considered for this tool development.

- Matlab

- Python

- C++

A semi- automatic feature extraction tool was implemented initially in

Matlab. Matlab was chosen because it is a matrix oriented programming

language and it provides an image processing toolbox, which could reduce

the development time significantly. However, licensing Matlab is expensive;

moreover usage of control loops slows down the program significantly,

therefore, I decided to use Python.

Python is an interpreted, object-oriented scripting language [9]. It is freely available and provides many extension modules, such as PIL (Python Imaging Library) and Numeric/numarray, which make it a very desirable tool. I therefore used Python to develop the code, together with the SDC morphology toolbox for morphological operations such as open, close, erode and dilate. However, processing large images (768 x 768 pixels) was computationally intensive because Python is an interpreted language. For example, the analysis based on the Hessian matrix to identify curvilinear structures took approximately 10-15 minutes per image on a standard Pentium 4 PC. The solution was to write the computationally intensive parts of the code in C.
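
The Hessian-based analysis mentioned above can be sketched as follows. This is a minimal illustration in pure Python, not the code of the tool. It assumes central finite differences for the second derivatives, dark structures on a bright background, and a simple response measure (the larger-magnitude eigenvalue minus the magnitude of the other one). The function names `hessian_eigenvalues` and `ridge_strength` are hypothetical. The explicit per-pixel loops also suggest why this step is slow in an interpreted language.

```python
import math

def hessian_eigenvalues(img, y, x):
    """Eigenvalues of the 2x2 Hessian at interior pixel (y, x).

    Second derivatives use central finite differences (an assumption;
    Gaussian-derivative filters would be more robust to noise).
    Returns (big, small) ordered so that |big| >= |small|.
    """
    ixx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    # Closed-form eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]].
    mean = (ixx + iyy) / 2.0
    root = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    lam1, lam2 = mean + root, mean - root
    return (lam1, lam2) if abs(lam1) >= abs(lam2) else (lam2, lam1)

def ridge_strength(img):
    """Per-pixel line response for dark lines on a bright background.

    Across a dark line the intensity curvature is strongly positive,
    while along the line it is close to zero. The response used here,
    big - |small| when big > 0, is an illustrative choice.
    Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            big, small = hessian_eigenvalues(img, y, x)
            if big > 0:
                out[y][x] = big - abs(small)
    return out

# Toy image: bright background (200) with a dark vertical line (50) in column 3.
image = [[50.0 if x == 3 else 200.0 for x in range(7)] for _ in range(7)]
response = ridge_strength(image)
```

On the toy image the response is nonzero only along column 3, where the dark line lies. For bright structures on a dark background the sign test on the eigenvalue would be reversed.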

Qt, a toolkit developed by Trolltech [14], was chosen to develop the user interface for the program. Qt is a C++ application development framework that includes a class library and tools for cross-platform development [14]. In addition, it is a sophisticated toolkit that is very helpful in developing efficient user interfaces. However, interfacing Python with Qt was not easy. The entire code was therefore ported to C++ using the Vigra and Newmat libraries. Both libraries are freely available, and both make use of the C++ Standard Template Library, which makes them generic and easy to use.

The final version of the project is written in C++. The image processing library Vigra 1.2.0 provides the necessary image processing functions, and the Newmat library provides the necessary linear algebra support. On the Windows operating system, the tool is run under Cygwin, which provides a Unix-like environment on Windows.

RESULTS

Figure 17: (a) Original Synechocystis cell image. Scale bar = 100 nm. (b) Extracted ribosomes

Figure 18a: Manually segmented model of a zoomed-in portion of the image (white circles represent ribosomes)

Figure 18b: Ribosomes detected using the tool overlaid on the original image

Fig. 17a and Fig. 17b show a Synechocystis cell image and the ribosomes segmented from it using the tool, respectively. Overlaying one image on the other in Adobe Photoshop showed that a large percentage of the ribosomes present in the cell were correctly identified. This was further confirmed by comparing the ribosomes identified by the tool against manually segmented images, as shown in Fig. 18a and Fig. 18b.

The table below gives test results for some of the images tested using the program. The number of ribosomes detected by the program differs from the number obtained by manual segmentation for reasons that fall into two categories:

- False Positives result from wrongly identifying features that are not ribosomes but have characteristics, such as intensity value, similar to those of ribosomes in the image. Alternatively, they may be parts of ribosomes that have not been identified in the current slice and will be observed in subsequent slices.

- False Negatives are ribosomes found by manual segmentation that the program did not detect.

File name (size in pixels)   Number from manual segmentation   Number correctly identified by the program   False Positives   False Negatives
Zap010.tif (892 x 759)       35                                33                                           14                2
Slice1.tif (682 x 682)       181                               137                                          2                 44
Slice2.tif (682 x 682)       70                                66                                           17                4

The tool was also tested on many other slices and the output was compared visually; similar results were obtained.
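
The counts in the table can be summarized as detection rates. The sketch below is an illustration added for clarity, not part of the tool. It uses two standard measures that the report itself does not name: recall (the fraction of manually segmented ribosomes that the program found) and precision (the fraction of the program's detections that are correct). The function name `detection_rates` is hypothetical.

```python
# (file name, manual count, correctly identified, false positives, false negatives)
results = [
    ("Zap010.tif", 35, 33, 14, 2),
    ("Slice1.tif", 181, 137, 2, 44),
    ("Slice2.tif", 70, 66, 17, 4),
]

def detection_rates(manual, correct, false_pos, false_neg):
    """Return (recall, precision) for one image."""
    # Sanity check: every manually marked ribosome is either found or missed.
    assert correct + false_neg == manual
    recall = correct / manual                    # share of manual ribosomes found
    precision = correct / (correct + false_pos)  # share of detections that are correct
    return recall, precision

rates = {name: detection_rates(m, c, fp, fn) for name, m, c, fp, fn in results}
for name, (recall, precision) in rates.items():
    print(f"{name}: recall = {recall:.1%}, precision = {precision:.1%}")
```

For the three images above this gives a recall of 94.3%, 75.7% and 94.3% and a precision of 70.2%, 98.6% and 79.5%, respectively. Slice1.tif misses more ribosomes but produces almost no false detections, whereas the other two images show the opposite pattern.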

Figure 19: (a) Original Synechocystis cell image and (b) extracted thylakoid membranes overlaid on the original image

Fig. 19a and Fig. 19b show a Synechocystis cell image and the thylakoid membranes identified by the tool superimposed on the original image, respectively. As the images show, most of the curvilinear structures representing thylakoid membranes have been identified. However, some segments inside the cell have been wrongly identified. In addition, there are discontinuities in some areas where the structure is slightly faded. The algorithm needs to be modified to address this issue.

Fig. 20a and Fig. 20b show the segmentation of filaments. More work needs to be done on filament extraction.

Figure 20: (a) Original Synechocystis cell image and (b) extracted filaments overlaid on the original image

CONCLUSIONS AND FUTURE WORK

This project focuses on ways to extract features from images of Synechocystis bacterial cells. The dataset consists of tomograms of the bacterial cell acquired with an electron microscope. Two different approaches have been used to detect ribosomes, thylakoid membranes and filaments. A method based on watershed segmentation, which is a special case of region growing, has been used to detect ribosomes. A comparison of the segmentation results produced by the program against those obtained from manual segmentation shows that the program identified a large percentage of the manually detected ribosomes. Nevertheless, watershed segmentation is not very useful for identifying curvilinear structures such as thylakoid membranes and filaments, because these structures have low contrast with respect to the background and are difficult to identify from their boundaries. Therefore, geometric properties of the curvilinear structures have been used to extract them from their surroundings. However, in 2D images there are areas where the curvilinear structure tends to fade out; in such cases a 3D view might be more helpful.

The program is capable of handling different file formats, such as jpg, png and tiff, depending on the availability of the corresponding libraries to read and write these formats. However, it has been tested only with tiff image files. In addition, it has been tested with only 3-4 files at a time because of memory limitations.

The algorithms are flexible and can easily be extended to support other requirements, such as extracting other features or use in similar applications. In addition, the source code for the image processing library and the linear algebra library is available, and can easily be modified to support additional requirements that the libraries do not currently meet.

This project addresses only the segmentation of ribosomes, thylakoid membranes and filaments. Future research should focus on identifying the other inclusions present in the cell. Developing a full-fledged version of the program could significantly cut down the time biologists spend on manual segmentation. One method that could be used to detect other features present in the cell is hierarchical segmentation: masking the segments that have already been identified and then trying to extract new ones. More research is needed to eliminate as many False Positives as possible in the case of filaments. Providing an option for the user to edit the segmented image and manually correct errors would be very useful.
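
The hierarchical segmentation idea suggested above could be sketched as follows. This is only an illustration of the masking principle under simplifying assumptions, not an implemented part of the tool: each detector is a plain intensity threshold standing in for a real segmenter (such as the watershed or Hessian-based methods), and the names `hierarchical_segmentation`, `detectors`, and the threshold values are hypothetical.

```python
def hierarchical_segmentation(img, detectors):
    """Run detectors in order; each one sees only pixels not yet claimed.

    `detectors` is a list of (label, predicate) pairs, where
    predicate(value) says whether a pixel value belongs to that feature.
    Returns a label image; None marks pixels that no detector claimed.
    """
    h, w = len(img), len(img[0])
    labels = [[None] * w for _ in range(h)]
    for label, predicate in detectors:
        for y in range(h):
            for x in range(w):
                # Mask: skip pixels already assigned in an earlier pass.
                if labels[y][x] is None and predicate(img[y][x]):
                    labels[y][x] = label
    return labels

# Toy image with very dark pixels (30), moderately dark pixels (90)
# and bright background (200).
image = [
    [200, 200,  30, 200],
    [200,  90,  90, 200],
    [ 30, 200, 200,  90],
]
detectors = [
    ("ribosome", lambda v: v < 50),   # hypothetical first pass: darkest pixels
    ("membrane", lambda v: v < 100),  # second pass: remaining dark pixels
]
labels = hierarchical_segmentation(image, detectors)
```

Because the second pass skips masked pixels, the pixels already labelled as ribosomes are not relabelled even though they also satisfy the second threshold.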

References

1) Ullrich Köthe. 1995. Primary Image Segmentation. Proc. 17. DAGM-Symposium, Springer.

2) A. Bartesaghi, G. Sapiro, S. Lee, J. Lefman, and S. Subramania. 2003. A new approach for 3D segmentation of cellular tomograms obtained using three-dimensional electron microscopy. Institute for Mathematics and its Applications, December 2003 Preprints #1950.

3) Adam Huang. 2003. Three-Dimensional Biomedical Image Segmentation And Visualization: A Shape-Based Approach. Ph.D. thesis. Arizona State University.

4) David Neary. 2000. Fractal methods in Image Analysis and Coding. Ph.D. thesis. Dublin City University School of Electronic Engineering.

5) D. Ballard and C. Brown. 1982. Computer Vision. Prentice-Hall.

6) Rolf Adams, Leanne Bischof. 1994. Seeded region growing. IEEE Trans. on PAMI. Vol. 16: 641-647.

7) D. Marr and E. C. Hildreth. 1980. Theory of edge detection. Proceedings of the Royal Society, London B. 207: 187-217.

8) Rafael C. Gonzalez and Richard E. Woods. 1992. Digital Image Processing. Addison-Wesley Publishing Company.

9) Alex Martelli. 2003. Python in a Nutshell.

10) Karank. 2002. Edge detection. chapter 9.

11) A.J. Koster and J. Klumperman. 2003. Electron microscopy in cell biology: an integrated view on structure and function. Supplement Nature Reviews Molecular Cell Biology 4: SS6–SS10.

12) Vigra. Home page. <http://kogs-www.informatik.uni-hamburg.de/~koethe/vigra/>

13) Davies, Robert. Home page. <http://www.robertnz.net/index.html>

14) Qt. Home page. <www.trolltech.com/>

15) Cygwin. Home page. <http://www.cygwin.com/>

16) Dr. Robert Roberson, E-mail to author, 16 April 2004.

17) O'Toole, E.T., Winey, M. J., McIntosh, J.R., Mastronarde, D.N. 2002. Electron Tomography of Yeast Cells. Meth. Enzymol. 351: 81-95.

18) Kaneko T, Tabata S., Whitmarsh J. 1997. Complete genome structure of the unicellular cyanobacterium Synechocystis sp. PCC6803. Plant Cell Physiol. 38(11): 1171-6.

19) Flaubert Mbeunkui. 2003. The effects of low nitrate levels on the freshwater cyanobacterium Synechocystis sp. strain PCC 6803: Construction of a bioreporter assay and molecular characterization by transcriptome and proteome analysis. Ph.D. thesis. University of Stuttgart.