Turning Hardcopy Data into Idrisi Files
By Peter Erickson and Ali Macalady
Revised by David Bitner and Josh Brandt
Winter 1997
There are two main choices for turning hardcopy maps, aerial photos, etc. into computer files, and the decision on which one to use is partly a matter of preference but largely a matter of practicality. If you only need a few key features from, say, a USGS quad, it might be easiest to digitize them. On the other hand, you could trace what you need, and scan that in, or, if you are really ambitious, scan the actual map in. There are many good references for deciding which to use, but after you get a feel for each method, you’ll be able to see the pros and cons of each.
We designed the following tutorial using the Morro Bay North Quad you are all so familiar with. However, the general procedure is the same for importing any type of hardcopy data.
I. SCANNING AN IMAGE
Where is the scanner?
The scanner is in Mudd 64 connected to a Mac Quadra called Trilobite. Images are placed face down in the scanner, and the lid closed. The scanner can only be used with Trilobite, so after you scan your image in (this will only take a few minutes), we suggest you transfer it to another computer. To do this, you can either use a disk (many images won’t fit, however) or select another geo dept. computer from either the chooser or ’Other Computers’ under the colorful apple menu, located in the upper left corner of the screen. When the other computer’s icon appears on the desktop, drag your image’s icon into its icon.
Selecting what you want to scan
First, choose an area of interest in the Morro Bay quad, making sure it contains at least 4 points where you know the exact coordinates (latitude and longitude or UTM). For this tutorial, smaller will be better than larger. It’s important to have many control points (20 or so is ideal), especially ones near the perimeter of the area you ultimately want in Idrisi. Since on the quad the only points you can really know are the ones that are tick-marked, this somewhat limits your selection. However, if you already had a geo-referenced satellite image of Morro Bay in Idrisi, then you could pick your control points as anything that remains constant through time that you could recognize on both the scanned image and the satellite image in Idrisi. Good points to pick in this case are mountain summits, road intersections, and bridges. The basic idea is whatever points you pick, you need to have some way (either in the hardcopy map you are scanning or on some digital geo-referenced image) of knowing the EXACT coordinates of several control points across your image to be scanned, or, as you will see later, digitized.
How do I scan?
For this exercise, you will trace some features (roads, contour lines, whatever you want) from the USGS quad and scan that traced image. You could also just scan the quad as is, but if you only want a certain amount of information, a trace of only those features you really care about is much easier to work with. So, the first step is to trace (dark pencil or pen is best) onto mylar or tracing paper the features you want and several control points. Take note of the coordinates of each point, either lat-long coordinates (unfortunately you will have to convert them to decimal degrees) or UTM coordinates, noted with the blue tick marks. You can interpolate between tick marks for control points if you wish, but make note of your relative confidence in your control points for use later. As noted on the map, a UTM marking of 703 means a UTM coordinate of 703,000 m. UTM coordinates (not lat-long) are standard for many types of digital data, but take your pick. We used lat-long because it was more familiar to us, but in many cases outside this tutorial it might be better to pick UTM, especially if the other data you are using is in UTM (coordinate transformations in Idrisi are often a pain, so it’s best to avoid them). Then place the tracing face down in the scanner.
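The conversion from degrees, minutes and seconds to decimal degrees mentioned above is simple arithmetic. Here is a minimal Python sketch of it; the function name and the example tick values are ours, chosen only for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # West longitudes and south latitudes are entered as negative numbers.
    return -value if hemisphere in ("S", "W") else value


# Illustrative tick marks: 35 deg 25 min 00 sec N, 120 deg 50 min 00 sec W
lat = dms_to_decimal(35, 25, 0, "N")
lon = dms_to_decimal(120, 50, 0, "W")
print(f"{lon:.4f} {lat:.4f}")
```

Write down as many decimal places as you can; four places of a degree is already on the order of ten meters on the ground.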
Open Deskscan II under the colorful apple in the corner. Click on the ’Preview’ button. On the screen will appear the whole field of view of the scanner. Use the cursor to window out what you traced, and click ’Zoom’. Now is the time to worry about what you want the image ’Type’ and ’Path’ to be. For scanning nice, dark traces, we found that the best combination was Type: Sharp B and W Drawing and Path: Laserwriter 300 dpi. For most GIS use, Paths such as Laserwriter 600 dpi or Adobe Photoshop seem to produce images which take too much memory and have a higher resolution than is even useful for working with satellite imagery. In any case, feel free to play around with the Type and Path, being aware of memory and resolution. You can also play with the contrast and levels, but Deskscan usually picks some reasonable levels all by itself.
Now, once you have windowed the image and made all your Path and Type selections, click on ’Final’. Save the image as TIFF format.
How do I clean up my scanned image and export it to Idrisi?
Your old friend Adobe Photoshop will come in handy here. Open Photoshop and open the scanned image. Depending on how the scan turned out, you may want to erase smudges or extra lines or even rotate the image so it’s right-side up. When you are happy with the image, select ’Save As’ and pick a name that a DOS/Windows computer will like: that is, make sure your title is 8 or fewer characters followed by a ’.tif’ tag (to tell the computer it is a TIFF image). Save as TIFF, and when it asks for Byte Order, select IBM PC (this is very important, and why you always have to take your image through Photoshop, even if you don’t want to make any changes.)
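If you want to double-check a file before carrying it over to the PCs, the two requirements above can be tested with a few lines of Python. This is only a sketch: the function names are ours, and the set of characters allowed in the name is a deliberately cautious assumption (letters, digits and underscores). The byte-order check relies on the fact that a TIFF file begins with the two bytes "II" for IBM PC (Intel) order or "MM" for Macintosh (Motorola) order.

```python
import re


def is_dos_tiff_name(filename):
    """True if the name is 1-8 letters/digits/underscores followed by .tif."""
    pattern = r"[A-Za-z0-9_]{1,8}\.tif"
    return re.fullmatch(pattern, filename, re.IGNORECASE) is not None


def tiff_byte_order(path):
    """Report the byte order recorded in the first two bytes of a TIFF file."""
    with open(path, "rb") as handle:
        marker = handle.read(2)
    if marker == b"II":
        return "little-endian (IBM PC)"
    if marker == b"MM":
        return "big-endian (Macintosh)"
    return "not a TIFF header"
```

A file that reports "big-endian (Macintosh)" still needs to go through Photoshop’s Save As with IBM PC byte order.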
Now, to get the file to the Gateway 2000 computers in the geolounge, you have to put your file on the Etienne Geo Users folder. Open Etienne with the chooser or under ’Other Computers’ under the colorful apple, and drag your file’s icon into Geo Users.
You are now through with those silly Macs. Go to one of the Gateways in the geolounge. On these computers, Etienne is known as the K: drive. In Windows, open Idrisi by clicking on Idrisi for Windows. Set the Environment to be K:/geousers.
How do I run the conversion to make my TIFF file an Idrisi image?
Under File->Import->Desktop Publishing Formats select TIFIDRIS. In the resulting window, type in your scanned file or double-click on the TIFF window and select it that way. Make a name for the resulting image (don’t attach any tag, Idrisi will do that for you), and click OK. The computer will crank away for awhile, and when it’s done, you should be able to display the scanned image (make sure you use the palette called bw)!!!
How do I geo-reference my scanned image?
Wooo-eee, this is the fun part. The Idrisi module for georeferencing is called Resample, and it’s located under Reformat. First, though, we need to create what is called a ’correspondence file’ to tell Idrisi which points are our control points and what we want their coordinates to be.
When you opened up your file, you may have noticed that Idrisi assigned it coordinates based on the number of rows and columns. We want to ’ReSample’ the image so that it has the correct (Lat-long or otherwise) coordinates. So, for each of your control points, use the window tool (icon: blue rectangles) to zoom in really close on each point. Position the cursor exactly at the center of the control point, and write down the x and y coordinates displayed at the bottom of the screen. Once you have done this, go to Data Entry->Edit, select ’Correspondence File’ and give the file a name you will recognize as being the correspondence file.
This first number you want to enter is the number of control points you will use. Then, following, you want to enter the ’old’ coordinates (the ones you took from the bottom of the screen) of the control point in the form x-old y-old followed by the new coordinates (the ones you took from the original map) in the form x-new y-new. Our correspondence file looked like this:
4
31.000 51.000 -120.8743 35.4166
1881.000 84.000 -120.8333 35.4166
50.000 98.000 -120.8333 35.3749
105.000 141.000 -120.8743 35.3749
Note the negative signs on the new x coordinates of the lat-long pairs. Because we are in the western hemisphere, we need to tack a negative sign on so that coordinates will get bigger as we go from west to east. Otherwise the computer gets mad. Note also the lack of commas. I sat swearing at the computer for some time when it wouldn’t do the Resample until I noticed the problem was the commas I had entered. Finally, note that the old coordinates end in zeros. I did this for simplicity, but you probably want to enter all the digits Idrisi gave you. Save the file and exit.
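If you keep your control points in a list, the layout described above (a count on the first line, then one space-separated line per point, no commas) can be generated rather than typed. This is a sketch in Python; the function name and the choice of decimal places are ours, and the points are the four from our file above.

```python
def correspondence_text(points, old_decimals=3, new_decimals=4):
    """Build correspondence-file text: count, then x-old y-old x-new y-new."""
    lines = [str(len(points))]
    for x_old, y_old, x_new, y_new in points:
        # Values are separated by spaces only; commas make Resample fail.
        lines.append(
            f"{x_old:.{old_decimals}f} {y_old:.{old_decimals}f} "
            f"{x_new:.{new_decimals}f} {y_new:.{new_decimals}f}"
        )
    return "\n".join(lines) + "\n"


control_points = [
    (31.0, 51.0, -120.8743, 35.4166),
    (1881.0, 84.0, -120.8333, 35.4166),
    (50.0, 98.0, -120.8333, 35.3749),
    (105.0, 141.0, -120.8743, 35.3749),
]
text = correspondence_text(control_points)
print(text)
```

The printed text can then be typed or pasted into the correspondence file editor.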
One Caution before we begin Resample. Depending (I think) on how big your file is, it may have been converted by Idrisi as Packed Binary instead of Binary. Resample won’t take files which are in Packed Binary, so check your data type using File->Describe. If it says ’Packed’, then under Reformat->Convert, run a conversion where the output data type is Byte and the output file type is Binary. You can leave the same name for the file. Presto!
Now, ready for the Resample. Open Resample and notice the complicated window. We’ll go through each box one by one. First, make sure you select an Image file. Later, if we have to Resample a digitizer file, we will want to select vector here. Select your input file (remember, you can always click on the box for the list of files) and your correspondence file, and give the output file a name. Select the Reference System you want the file to be in, in this case probably Lat-Long. Leave the Reference Units in Meters, the Unit Distance as 1, the Background Value as 0, and the Mapping function as Linear. For the Resampling type, the Idrisi manual recommends ’nearest neighbor’ for files with data values that cannot be changed, such as categorical or qualitative data like soil types. It recommends ’bilinear interpolation’ for quantitative data such as remotely sensed imagery. Since our image is just 0’s and 1’s, where the 1’s represent what we are interested in and the 0’s represent everything else, we have qualitative data and should select nearest neighbor.
For the min/max x and min/max y boxes, enter the coordinates of what you want the final image to have. If you are going to be using this image with another Idrisi image, such as a Morro Bay satellite image, you want to specify these coordinates to be exactly the same as the extent of the other image. (To learn anything and everything about a file or image, select File->Describe and the file.) Similarly, set the rows and columns to match that of the image you want to match. Since in this case we are not matching to any pre-existing image, choose the coordinates to be approximately the entire area of your scan, and the rows and columns to be: #Columns= (MaxX-MinX)/Resolution, #Rows=(MaxY-MinY)/Resolution. Since we are not dealing with resolutions specified by the type of remotely sensed data, choose this to be whatever you wish such that you have a few hundred rows and a few hundred columns.
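The row and column formula above is easy to check with a short calculation. The Python sketch below uses the extent of our four control points and a resolution of 0.0001 degrees; both are illustrative choices, not required values, and the function name is ours.

```python
def grid_size(min_x, max_x, min_y, max_y, resolution):
    """Return (columns, rows) for an extent at a given cell size."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    if max_x <= min_x or max_y <= min_y:
        raise ValueError("max must be greater than min on both axes")
    # Round to the nearest whole cell, since Idrisi needs integer counts.
    columns = round((max_x - min_x) / resolution)
    rows = round((max_y - min_y) / resolution)
    return columns, rows


columns, rows = grid_size(-120.8743, -120.8333, 35.3749, 35.4166, 0.0001)
print(columns, rows)
```

For this extent the result is 410 columns by 417 rows, which falls in the "few hundred" range suggested above.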
Finally, click OK. Idrisi now gives you a box showing the root mean square (RMS) error and each of the individual residual errors. A high residual for a point means the point’s coordinates were ill-chosen, that is, Idrisi couldn’t find a transformation that fit that point very well. The total RMS indicates the positional error of all the control points in relation to the transformation equation. According to US national map accuracy standards, the RMS should be less than 1/2 the resolution of the input image. For now, don’t worry too much about the RMS.
So the transformation, Dr. Frankenstein, is complete. What Idrisi has done is taken the control points you gave it, and then warped (or rubbersheeted) the image to fit those points. If it can accommodate all the points, you get a low RMS. A high RMS usually means that the coordinates you specified weren’t totally exact, and so that’s why the RMS window that popped up allowed you to drop points with a high residual. In any case, if you display the image, you’ll see that it now has lat-long coordinates. If you were to use this image with another image, you would want to use the Describe function to make sure that they had exactly the same rows, columns, and coordinates. If they didn’t (sometimes they differ in the ten-thousandths place by some random error) you can use File->Document to make sure they agree perfectly.
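To make the residuals and the total RMS less mysterious, here is a small Python sketch of the same idea. It assumes that the Linear mapping function is a first-order fit of the form x-new = a0 + a1*x + a2*y (and likewise for y-new), solved by least squares. That assumption, the function names, and the made-up control points are ours; Idrisi’s own calculation may differ in detail.

```python
import math


def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are too few or lie on one line")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, 3):
            factor = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= factor * a[col][k]
    solution = [0.0, 0.0, 0.0]
    for row in range(2, -1, -1):
        total = sum(a[row][k] * solution[k] for k in range(row + 1, 3))
        solution[row] = (a[row][3] - total) / a[row][row]
    return solution


def fit_linear(points):
    """Least-squares coefficients for x-new and y-new from control points."""
    basis = [(1.0, x, y) for x, y, _, _ in points]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)]
              for i in range(3)]
    coeffs = []
    for target in (2, 3):  # index of x-new, then y-new, in each point
        rhs = [sum(b[i] * p[target] for b, p in zip(basis, points))
               for i in range(3)]
        coeffs.append(solve_3x3(normal, rhs))
    return coeffs


def residuals(points, coeffs):
    """Distance between each fitted position and its stated new position."""
    cx, cy = coeffs
    result = []
    for x, y, x_new, y_new in points:
        dx = cx[0] + cx[1] * x + cx[2] * y - x_new
        dy = cy[0] + cy[1] * x + cy[2] * y - y_new
        result.append(math.hypot(dx, dy))
    return result


def total_rms(values):
    """Root mean square of the individual residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Made-up points: the last one is deliberately off by 1 unit in x-new.
points = [
    (0.0, 0.0, 1000.0, 5000.0),
    (100.0, 0.0, 1200.0, 5000.0),
    (0.0, 100.0, 1000.0, 4800.0),
    (100.0, 100.0, 1201.0, 4800.0),
]
point_residuals = residuals(points, fit_linear(points))
print(point_residuals, total_rms(point_residuals))
```

Notice that one sloppy point raises the residual of every point, because the fit spreads the error around. That is why dropping a single bad control point can lower the RMS so much.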
II. The Digitizer
Where is the digitizer?
The digitizer is connected to the Gateway 2000 closest to the window in the GeoLounge. It has a flat grey tablet and a puck (mouse-like thing with multiple buttons). The digitizer works by having many wires just under the surface of the board which sense the exact location of the puck’s crosshairs. Please don’t pile books and stuff on the digitizing tablet, as it’s not very strong.
What software do I use to digitize?
The software used with the digitizing tablet is called TOSCA. It was designed by the same people who made IDRISI and so it is easy to use the two programs in conjunction with one another. An inconvenient aspect of TOSCA is that it operates only in DOS and requires a considerable amount of memory. This means that you have to do some manipulating to get TOSCA to load. The first step is to turn off the network on the computer in order to free up memory. Sean Fox created a little program in DOS which simplifies things a bit. Run it as follows:
Exit Windows and, once in DOS, switch to the C: drive by typing C: at the K:\ prompt. Then type TOS to run Sean’s program. Now reboot by pressing Ctrl-Alt-Delete simultaneously. While rebooting, press Ctrl-X to keep the computer from automatically loading Windows. You should now be in DOS and you should see the C:\ prompt on the screen. To re-activate the network when you are done using TOSCA, type UNTOS at any prompt in DOS. Then reboot again. Once you have turned off the network, you can load TOSCA. At the C:\ prompt, change the directory to IDRISI by typing ’cd IDRISI’. Then type TOSCA and you should find yourself in the Main Menu of the application.
How do I get started digitizing?
We designed this little tutorial to show you the basics of scanning and digitizing and then referencing both types of information into IDRISI. This should give you an understanding of how the digitizer works and an idea as to how you might be able to employ the digitizer for your own purposes later in the course. Much of the following material comes almost directly from the TOSCA manual.
I will be assuming in the following instructions that you have at least read the scanning portion of this tutorial. Tape the Morro Bay Quad map securely to the digitizer, making sure that the area you choose is completely within the active region of the tablet (dark grey rectangle). Each time you begin using TOSCA you must run a test configuration to make certain that the puck is accurately relaying information to TOSCA. To do this, enter the Digitize Menu from the Main Menu and then choose ’test config’. If everything is running smoothly, all the headings in the first reading should be ’0’. The rest of the test is pretty self-explanatory. If something isn’t working correctly, you can refer to one of us (Pete and Ali) or to the Configuring Digitizers section of the manual. If you feel like tinkering and solving the problem yourself using the manual (be careful, for your and everyone else’s sake), most of the configuration happens in a separate application of TOSCA called TOSCANEL. You can load this program by typing its name at the C:\IDRISI prompt in DOS.
DEFINE FILE
It is always necessary to define a file before digitizing data. To do this, choose ’define file’ while in the Digitize Menu. This file will be one of a number of files you will need to create to transfer data into IDRISI, so it will be best to pick a memorable name for the file. The path is set now so that all files are saved to C:\IDRISI\geo270\.
TOSCA now asks you to pick minimum and maximum X and Y coordinates for the area to be digitized. You can enter these coordinates as either UTM coordinates or latitudes and longitudes (if you choose lat-long, you need to convert minutes and seconds into degree fractions. Also, it is important to know that West longitudes are entered as negative numbers as are South latitudes; so you would enter 120° 52’ 30” W as -120.875). After entering all of our coordinates as latitudes and longitudes, we decided that it was actually smarter to define the area in terms of UTM coordinates because, ’If lat/long coordinates are required, it is recommended that the map be digitized in a plane reference system such as UTM coordinates and then be projected into lat/long coordinates. A lat/long projection has a trapezoidal shape which cannot achieve a relationship with the square grid structure of the digitizer (39).’ UTM ticks exist on the USGS quads. These ticks are 1,000 m apart. For this tutorial, select the same reference system you used for your scanned image. It is good to define the X-Y min-max area slightly larger than the area in which your relevant data lie.
TOSCA asks whether you want to have your points entered as integers or real numbers. In our case, we want real numbers, since our lat-long coordinates have numerous decimal places. If you are using a planar (such as UTM) coordinate system with very large numbers, it is fine to choose integer.
CONTROL POINTS:
TOSCA wants to know whether your control points will be entered from the keyboard or from a file. Since you haven’t defined any control files, enter K. Control points are points whose exact coordinates are known. These need not be the points on the map with labeled latitude and longitude; as long as you know the latitude and longitude or Universal Transverse Mercator coordinates of a point, it will do. In our case, we only know the coordinates at the lat-long tick marks (but remember, you can interpolate between tick marks if you are desperate).
You should enter at least four of these coordinates. TOSCA accepts up to 125 control points, should you ever feel the desire to go crazy with the digitizer. Enter one point using the keyboard and then locate that point using the crosshairs of the puck and press the ’0’ key in the upper right hand corner. For future reference, the ’0’ key always enters a point, and the ’7’ key is the finish key. ’3’ is the complete polygon key and ’2’ toggles the digitizer from point into stream mode (more on this later). Enter ’-999’ when you’ve finished digitizing control points. If you ever want to digitize these control points again, or continue work on a file after you have removed the map from its exact position on the tablet, you should save this file as a control point file (TOSCA will ask you if you want to do this).
TOSCA should now display the control points on the screen along with each point’s RMS error (Root Mean Square error), which, according to the manual, ’indicates the amount of error associated with the translation of the map coordinates to the digitizer device coordinates.’ Error is certainly always something to be concerned with, especially if you are digitizing points which lie within a rather large region on a map. For more details, we suggest that you refer again to the Data Quality and the Calculating Allowable RMS sections of the manual. Here is a very short introduction to calculating allowable RMS quoted from the manual, p. 38: ’According to the 1947 revision of the US National Map Accuracy Standards, maps shall have no more than 10 percent of tested points in error by more than 1/30 of an inch for 1:20,000 scale maps or smaller, and no more than 1/50 inch for maps greater than 1:20,000. Conversion of accuracy standards into statistical analysis of the allowable RMS requires that 90 percent of the accidental errors shall not be larger than 1.64 times the RMS (that is, 1.64 standard deviations, assuming a normal distribution in error). Therefore:
Allowable RMS = [Acceptable error on the ground (or error on the map*scale conversion*unit conversion)] / Z score probability of occurrence (which is 1.64)’
Again, you will not really be able to do this calculation for the tutorial since we have chosen to enter data into a lat-long coordinate system. If the RMS values seem really high (in our case higher than .5 or so), you should re-enter the control points and/or enter more control points.
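For a project digitized in UTM, where ground distances are in meters, the allowable RMS formula quoted above could be evaluated like this. It is a sketch in Python with names of our own choosing. The map tolerance is passed in rather than looked up from the scale, so use whichever value the standard assigns to your map; the 1/50 inch and 1:24,000 figures in the example are illustrative inputs only.

```python
INCH_TO_METER = 0.0254  # unit conversion from map inches to meters


def allowable_rms(map_error_inches, scale_denominator, z_score=1.64):
    """Allowable RMS on the ground, in meters."""
    # Error on the map * scale conversion * unit conversion
    ground_error_m = map_error_inches * scale_denominator * INCH_TO_METER
    # Divide by the Z score that covers 90 percent of the errors
    return ground_error_m / z_score


rms_m = allowable_rms(1 / 50, 24000)
print(f"{rms_m:.2f} m")
```

With these inputs the acceptable ground error is about 12.19 m and the allowable RMS comes out at roughly 7.43 m.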
Now TOSCA asks for two opposite corners of the work-area to be digitized. All this does is set a window which will be visible on the computer screen. If you want to see the points and lines you will be digitizing on the computer screen, you can turn on the nodes by pressing ’N’ when the cursor/mouse arrow is in the control box.
DIGITIZING YOUR DATA POINTS:
You should be back in the Digitize menu. If we were more concerned with error in this exercise, we would now want to set the auto snap and the point tolerance. We will just disable these for now by selecting ’Set autosnap’ and then entering ’0’. Also enter ’0’ for point tolerance. For future reference, here is a quote from the manual which explains the function of autosnapping and meaning of point tolerance:
’Snap Tolerance refers to the distance between the node (coordinate location in TOSCA) of a newly digitized feature and the node of an already existing feature within which the two positions are to be considered identical. If the new point falls within the snap tolerance of an existing node, the position of the new point is ’snapped’ to locate it exactly on the existing node. The snap tolerance represents positional error, and should be no larger than the allowable RMS and no smaller than the RMS achieved when setting the control points.
’The point tolerance refers to the distance which must be achieved between a point just digitized and the next point to be digitized before the digitizer will allow the new point to be digitized. If the tolerance is set to 12 meters, for example, TOSCA will not accept a new point until it is more than 12 meters from the previous point. Point tolerance...reflects the sampling rate...(it) should be set at half the snap tolerance (66-67).’
Again, for more information, refer to the manual - either pp. 66 - 67 or the Digitize section (p. 10).
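The two tolerances quoted above can be pictured with a few lines of Python. This is only our illustration of the rules as the manual states them, not TOSCA’s actual code, and the function names and sample coordinates are made up.

```python
import math


def apply_point_tolerance(points, tolerance):
    """Keep a point only if it is farther than tolerance from the last kept."""
    accepted = []
    for point in points:
        if not accepted or math.dist(point, accepted[-1]) > tolerance:
            accepted.append(point)
    return accepted


def snap_to_node(point, nodes, snap_tolerance):
    """Move a new point onto the nearest existing node if it is close enough."""
    if not nodes:
        return point
    nearest = min(nodes, key=lambda node: math.dist(point, node))
    if math.dist(point, nearest) <= snap_tolerance:
        return nearest
    return point


# A puck moving along a line, sampled in stream mode (meters)
stream = [(0, 0), (5, 0), (10, 0), (13, 0), (26, 0)]
print(apply_point_tolerance(stream, 12))
print(snap_to_node((101, 100), [(100, 100)], 2))
```

With a point tolerance of 12 meters, the points at 5 and 10 meters are dropped because they are too close to the previous accepted point. With a snap tolerance of 2, a point digitized 1 meter from an existing node lands exactly on that node.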
Now for the fun part! To begin digitizing your points, choose ’digitize’. TOSCA will ask you whether you want to record left-right polygon data; answer no for now. We are not entirely clear what this refers to, though we think it has something to do with entering arcs in sequence to create polygons. TOSCA will now ask you to identify and name the data you want to enter. If you have more than one feature identified as the same thing (for example, if you want to digitize all the wells in your section so you can identify them all together), type -2. Then begin digitizing. To finish digitizing, press the end button (’7’) twice. If you want each feature to have a separate identifier, just type in a number id and digitize the point, line or polygon. If you are digitizing points with the same identifier, press ’0’ to enter the point and ’7’ to tell TOSCA that you want to enter another point. To enter lines, just trace over the line with the puck and press ’0’ whenever the line changes direction. To end the line, press ’7’ once. To create a polygon, follow the same directions as with lines, except press ’3’ when you want the polygon to be closed. You should identify different feature types differently; that is, don’t identify points, polygons and lines together. When you are entering a line, it may be better to switch from point to stream mode. Do this by pressing ’1’ on the puck.
If you were not using a lat-long coordinate system, and you wanted to later reference your planar UTM coordinates into a lat-long system or into another file in IDRISI, you would want to create an id and digitize points of known position (control points) on the map. Since we will not resample these data in IDRISI, you don’t need to do this.
Enter ’-1’ when you are finished digitizing. Exit from this menu and save the file by selecting ’save file’ from the Digitize Menu. You should first save the file as a composite file (press ’5’ when it asks you how you want to save the file). IDRISI will only read files containing one object type (points, lines, or polygons), so you then have to go back and create separate files for separate feature types. To do this, open the composite file (select Open from the Files Menu) and save the file again, only this time choose a different name (’points’ for example) and save only the points. Re-open the composite file to save another object type, and repeat until all of the features have been saved. If you have two different features that are of the same object type (such as roads and rivers) that you want to be able to manipulate separately in IDRISI, you will also need to save those as separate files.
BASIC EDITING:
You can use the EDIT FEATURES Menu to do some basic editing before you import your files into IDRISI. If you have some points, lines or polygons you’d like to erase, you can select ’Delete Feature’ and choose the feature you want to get rid of either by its id or by the sequence number assigned to it (to be able to view these identifiers, make sure the nodes are on). If you want to clean up the screen, you can use the Redraw command in the menu-box below the working Menu - just type in ’R’ when you are in this menu. You can zoom in and out using the Zoom command, which is near Redraw. Select this command, and then use the mouse to drag a square around the area to be enlarged. To zoom back out to normal, press both mouse keys at the same time. Once you are done editing, you will have to do the entire saving process again. Read pages 70-71, 20-22 and 14-16 in the manual for more editing fun.
IMPORTING INTO IDRISI:
To bring your digitized data into IDRISI, first copy (using the File Manager is easiest) the digitized files into your working directory, probably something other than the geo270 folder set as the digitizer path. Open up IDRISI for Windows, and choose the map icon on the tool bar. Choose ’Vector’ in the ’Type of File to Display’. Then type in the name of one of the files containing a single feature type. You may have to change the environment to your working directory to get the files to open. Once one file is opened, you can add other feature-types by selecting ’Add layer’. This will show all the digitized layers together. To display them on top of the scanned image, display the image, select Add Layer from the toolbar on the right side of the screen, and add in the digitized vector layers one by one. When we did this, the overlay was a bit skewed and things that were supposed to match up didn’t exactly match up. We think this is because despite the advice of the Tosca manual, we digitized in lat-long, whose coordinates take the form of a trapezoid, instead of UTM, whose coordinates would be a nice rectangular grid.







