Ran's Thoughts: 2010

Wednesday, October 13, 2010

RoiView - Explore InSAR data and more

RoiView was originaly designed to replace DGX/MDX software which require bureaucracy to obtain. This software was used in order to view ROI-PAC interferometry processing results but producing an image from file was not so strait forward. In time support was added for ERDAS ER Mapper header files and for user provided parameters with or without header files.

Current RoiView version (0.75) is available for download at

See also RoiView's help files

Screenshot:

Tuesday, September 28, 2010

HowTo Parse HTML text using python.

I needed to to get the text of a specific div elemnt in an html file. I tried to use python's standard library modules for markup processing such as htmllib etc. but I couldn't figure how to use them. I've created my own module just for getting the text from an html element:

#!/bin/env python
from htmlentitydefs import entitydefs as ent
import string
# This module enable you to extract text from a certian HTML element
#
# by Ran Novitsky Nof, 2010
# ran.nof@gmail.com
#
# example of use:
# say we want to get the text in an element of type tag (e.g. 'div','a','span' etc.)
# who has an attribute key (e.g. "id","class","href" etc.) with a value of val
# for example in order to extract the text of a div element with id of textdiv from a file htmlfile.html:
#
# <html>
# <head>
#   :
# </head>
# <body>
#   :
# <div "id"="textdiv">I will not buy this <a href="spam">record</a> it is scratched. </div>
#   :
# </body>
# </html>
#
# use:
# from htmlparser import Parser
# htmlfile='htmlfile.html'
# tag,key,val = ('div','id','textdiv')
# text=Parser(htmlfile).getText(tag,key,val)
# print(text)
#
class Element():
def __init__(self):
    self.startTag = -1
    self.endTag = -1
    self.attrib = {}
    self.keys = self.attrib.keys()
    self.innerHTML = ''
    self.tag = ''
    self.start = -1
    self.end = -1

class Parser():
def __init__(self,infile):
    self._root = open(infile).read()
    self._root = self._root[self._root.find('<body'):self._root.find('</body')]
    self._root = self._root[self._root.find('>')+1:]
    self.tags = set()
    i,j=0,0
    self.tagstarts = {}
    self.tagends = {}
    self.elements = []
    while i<len(self._root):
      i = self._root.find('<',i)
      j = self._root.find('>',i)
      if not j>i: break
      tag = self._root[i+1:j].split()[0]
      if tag.startswith('/'):
        tag = tag[1:]
        self.tagends[tag][-1-self.tagends[tag][::-1].index(None)]=((i,j))
      else:
        self.tags.add(tag)
        if tag in self.tagstarts:
          self.tagstarts[tag].append((i,j))
          self.tagends[tag].append(None)
        else:
          self.tagstarts[tag]=[(i,j)]
          self.tagends[tag]=[None]
      i=j+1
    self.getElements()
def getElements(self):
    for tag in self.tags:
      if not tag in ['img']:
        for i in range(len(self.tagstarts[tag])):
          element = Element()
          element.startTag = self.tagstarts[tag][i][0]
          element.endTag = self.tagstarts[tag][i][1]
          tagData = self._root[element.startTag+1:element.endTag].replace("\"","")
          element.tag = tag
          element.attrib=dict([a.split('=') for a in tagData.split() if '=' in a])
          element.start = element.startTag
          element.end = self.tagends[tag][i][1]
          element.innerHTML = (self.tagstarts[tag][i][1]+1,self.tagends[tag][i][0])
          self.elements.append(element)
def getText(self,tag,key,val):
    element = [element for element in self.elements if element.tag==tag and element.attrib[key]==val]
    if len(element):
      element = element[0]
      start,end = element.innerHTML
      text = self._root[start:end]
      i,j=0,0
      while i<len(text):
        i = text.find('<',i)
        j = text.find('>',i)
        if i<0 or j<0: break
        text = text[:i]+text[j+1:]
      i,j=-1,-1
      while i<len(text):
        i = text.find('&',i+1)
        j = text.find(';',i+1)
        if i<0 or j<0: break
        if text[i+1:j] in ent.keys(): text = text[:i]+ent[text[i+1:j]]+text[j+1:]
    else:
      text=None
    return text

Note the code does not check if the html code is correct. it also work only for the body part, and ignore img tags.

Sunday, July 18, 2010

SSH with no password

When you wish to login to a remote system through an ssh session without using any password you'll need to produce a key on the local computer and on the remote computer in order to allow password - free entrance.
first you'll need to produce the key in the local computer:
ssh-keygen -t dsa -b 1024 -P ""
press Enter on the questions followed by the command.
the output will be something like:
Your identification has been saved in /home/yourname/.ssh/id_dsa.
Your public key has been saved in /home/yourname/.ssh/id_dsa.pub.
The key fingerprint is:
4f:9e:4c:17:8f:19:37:32:bc:45:4f:2a:a2:8e:1c:15 yourname@your.loacl.machine
The key's randomart image is:
+--[ DSA 1024]----+
|           .o    |
|           .oo. |
|          . o+   |
|      o    *.+. .|
|       .E . + . |
|      o .+.. . .|
|       +o.o. + |
|       o+E       |
|       o=.       |
+-----------------+

This will produce two files in you .ssh directory: id_dsa and id_dsa.pub

cat id_dsa.pub into ~/.ssh/authorized_keys on the remote system you want to log into.
change mod to 600:
chmod 600 ~/.ssh/authorized_keys

(source of the tip)

Friday, June 25, 2010

Howto export points to a Google Earth KML file

The problem:
A table of xyz data points can be viewed spatially with different methods:
GMT and ArcView are just a couple of options. I often get tables of spatial data points in Microsoft Excel (xp ver.) format or CSV of some kind. An easy (and free) way to visualize the data points on Windows would be to export the data and view it on Google Earth.
The solution:
points2kml (115kb).
In order to ease the work-flow I wrote a VBA macro add-in for Excel that exports selected fields (longitude, latitude, name and data) to a Google Earth KML file. the point2kml add-in and can be found on my scripts page.

Wednesday, June 2, 2010

How to format a hard drive on Linux

The Problem:
How to format a hard drive on Linux.

The Solution:
in order to format a HD on Linux you'll need super user privileges.
you can use sudo if you have sudo privilages or in the terminal
switch to super user:

>su
(give super user pwd)
and ignore the sudo in the following commands. if you're system can't find the commands while in su mode, try to add /sbin/ before the command.
follow the steps below: marked with •
run commnads marked with >
commands output looks like this.
Warning: formatting your HD will erase all data on it!

Connect the HD to the computer, make sure it's on.
get the hard drive's (device) name on your system:

>df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/vg_insarlablin-lv_root

                      908G   77G  786G   9% /

/dev/sda1             194M   31M  153M  17% /boot

/dev/sdb1             1.4T  1.1T  193G  86% /home/novitsky

/dev/sdc1             1.4T  1.1T  193G  86% /home/novitsky/data

Now lets say our HD is sdb1.

lets build it's partition:

>sudo fdisk /dev/sdb

The number of cylinders for this disk is set to 182401.

There is nothing wrong with that, but this is larger than 1024,

and could in certain setups cause problems with:

1) software that runs at boot time (e.g., old versions of LILO)

2) booting and partitioning software from other OSs

   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help):

pressing m will get this menu:

Command action

   a   toggle a bootable flag

   b   edit bsd disklabel

   c   toggle the dos compatibility flag

   d   delete a partition

   l   list known partition types

   m   print this menu

   n   add a new partition

   o   create a new empty DOS partition table

   p   print the partition table

   q   quit without saving changes

   s   create a new empty Sun disklabel

   t   change a partition's system id

   u   change display/entry units

   v   verify the partition table

   w   write table to disk and exit

   x   extra functionality (experts only)

pressing p will print the current partition:

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes

255 heads, 63 sectors/track, 182401 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk identifier: 0x000923e6

   Device Boot      Start         End      Blocks   Id  System

/dev/sdb1               1      182401  1465136001   83  Linux

pressing d will delete the current partition. depending on the partition you want to delete (this case 1) press the number of the partition to be deleted. if all the disk should be formatted and there are several partitions repeat pressing d and the number of the partitions until you delete them all.
pressing n will create a new partition.
pressing p will create a primary partition. select 1 as the number of the partition. use the defaults (just press Enter) for the start and end cylinders.
pressing t will set the file system type.
pressing l will list the available file systems.
select 83 if you want a Linux file system
pressing w will write and exit to shell.

Now we will create a new file system on the new partition:
> sudo mkfs -t ext2 /dev/hdb1
remember to change the number at the end if your HD device number is different.
if your using different fs than ext2 (newer Linux systems have ext3 and fc11 has ext4) replace the ext2 in the command with the correct fc name.
the mkfs may take some time to run (depending on the computer and HD size).

To add a label to the partition use:
e2label /dev/sdb1 NEWLABEL
where sdb should be changed to the correct device and NEWLABEL changed to the new label.

Now add the device to the fstab file in order to mount it:
> sudo vi /etc/fstab
and add a line like:
LABEL=NEWLABEL MOUNTPOINT ext3 defaults 1 2
where NEWLABEL is the disk labe and MOUNTPOINT is where you want to mount the disk.
mount it using:
> sudo mount -a
Note you might need to change permissions and owner of the mount point:
> sudo chown username:usergroop MOUNTPOINT
when finnished - reboot.

Wednesday, April 28, 2010

Moving memory from swap (page) to RAM

When using more memory than your RAM Linux will use swap which is placed on your hard drive and is much slower. Sometimes after using swap there might be a situation where the RAM is free but some of your data is on the swap. There is a way to relocate the swap data to the RAM.
more details and a nice script to do is are available at: Ubuntu Community Documentation Swap FAQ page.

sudo swapoff -a

sudo swapon -a

The commands simply shut off swap and turning it on back again.
You'll have to make sure there is enough space on the RAM to accommodate the Swap data, this can be done using:

free

the output will show you the used space of Swap and free space of RAM:

             total       used       free     shared    buffers     cached
Mem:       8047896     772268    7275628          0      19540     129372
-/+ buffers/cache:     623356    7424540
Swap:     10108920          0   10108920

The example above show that no data (0) is stored on Swap.

Wednesday, April 21, 2010

Using hillshade image as intensity (improved matplotlib shade)

Matplotlib module enables hillshade method (shade) using a LightSource class (v 0.99).
The problem is it uses the data itself as intensity and data. It is very useful for viewing a DEM but sometimes you would like the DEM as intensity underlying some other data. Another problem is that the shade method is producing a very light colored image sometimes even white where intensity is high.
I used as an example a DEM derived from SRTM v4 data acquired at the International Centre for Tropical Agriculture (CIAT - http://srtm.csi.cgiar.org) the hillshade production was made using LightSource class with azimuth of 165 deg. and altitude of 45 deg.)


DEM - gist_earth color scheme		Hill-shade (azdeg-165,altdeg-45)

matplotlib shade method		My shade method

The difference in the shading colors derived from the method used to produce it. While the matplotlib method uses "hard light" method I use a "soft light" method. the matplotlib is converting the RGB colors to HSV and then calculate the new saturation and value according to the intensity. I use a formula based on the description of ImageMagick's pegtop_light.which is much faster as it is a single formula. Another advantage is the option to use a separate layer as the intensity and another as the data used for colors.
The modified functions are hillshade and set_shade as follows:

#!/bin/env python
from pylab import *
def set_shade(a,intensity=None,cmap=cm.jet,scale=10.0,azdeg=165.0,altdeg=45.0):
''' sets shading for data array based on intensity layer
or the data's value itself.
inputs:
a - a 2-d array or masked array
intensity - a 2-d array of same size as a (no chack on that)
                    representing the intensity layer. if none is given
                    the data itself is used after getting the hillshade values
                    see hillshade for more details.
cmap - a colormap (e.g matplotlib.colors.LinearSegmentedColormap
              instance)
scale,azdeg,altdeg - parameters for hilshade function see there for
              more details
output:
rgb - an rgb set of the Pegtop soft light composition of the data and
           intensity can be used as input for imshow()
based on ImageMagick's Pegtop_light:
http://www.imagemagick.org/Usage/compose/#pegtoplight'''
if intensity is None:
# hilshading the data
    intensity = hillshade(a,scale=10.0,azdeg=165.0,altdeg=45.0)
else:
# or normalize the intensity
    intensity = (intensity - intensity.min())/(intensity.max() - intensity.min())
# get rgb of normalized data based on cmap
rgb = cmap((a-a.min())/float(a.max()-a.min()))[:,:,:3]
# form an rgb eqvivalent of intensity
d = intensity.repeat(3).reshape(rgb.shape)
# simulate illumination based on pegtop algorithm.
rgb = 2*d*rgb+(rgb**2)*(1-2*d)
return rgb

def hillshade(data,scale=10.0,azdeg=165.0,altdeg=45.0):
''' convert data to hillshade based on matplotlib.colors.LightSource class.
    input:
         data - a 2-d array of data
         scale - scaling value of the data. higher number = lower gradient
         azdeg - where the light comes from: 0 south ; 90 east ; 180 north ;
                      270 west
         altdeg - where the light comes from: 0 horison ; 90 zenith
    output: a 2-d array of normalized hilshade
'''
# convert alt, az to radians
az = azdeg*pi/180.0
alt = altdeg*pi/180.0
# gradient in x and y directions
dx, dy = gradient(data/float(scale))
slope = 0.5*pi - arctan(hypot(dx, dy))
aspect = arctan2(dx, dy)
intensity = sin(alt)*sin(slope) + cos(alt)*cos(slope)*cos(-az - aspect - 0.5*pi)
intensity = (intensity - intensity.min())/(intensity.max() - intensity.min())
return intensity

Example of use:
One can save the code to a file named say: shading.py
now say we have a 4 byte float DEM data in a 560 lines 420 samples binary file. in a python code:

from pylab import *
from shading import set_shade
from shading import hillshade
dem = fromfile('DEM.dem',dtype=float32).reshape(560,420)
rgb = set_shade(dem,cmap=cm.gist_earth)
imshow(rgb)

will produce the "my shade method" image as above.

say we have a data to be plot using the DEM data as intensity:

replace the line before last with:
rgb = set_shade(data,intensity=hillshade(dem),cmap=cm.gist_earth)