Science With the Virtual Observatory
IRAF is the Image Reduction and Analysis Facility, a general purpose software system for the reduction and analysis of astronomical data. The core IRAF distribution includes a good selection of tasks for general image processing and graphics plus a large number of programs for the reduction and analysis of O/IR data in the "noao" package. The bulk of IRAF software in the community is in the form of individual tasks written by users and small to rather large external packages developed by groups outside of NOAO which handle data from other observatories (e.g. HST) and wavelength regimes (EUVE for extreme ultra-violet or ROSAT/AXAF for X-ray). These external packages are distributed separately from the main IRAF distribution and can be installed easily either for an entire IRAF site or by individual users.
The NVOSS software distribution contains a setup script that will do a network-based install of IRAF on Mac OS X and Linux systems. This includes the core system plus several external packages (STSDAS/TABLES) and support software (i.e. DS9 display server and XGterm graphics terminal) used in the following examples.
Installing IRAF requires root permissions due to links made into the system by the installer, but it is as simple as
# cd $NVOSS_HOME/iraf
# csh -f ./iraf_setup.csh
IRAF may be set up in any directory on the machine; the NVOSS directory structure is used simply as a convenience. The setup script may also be executed with the 'sudo' command instead of logging in as the root user (most Mac OS X users will need this option).
NOTE FOR WINDOWS XP USERS: The IRAF release used for NVOSS is available for Windows XP systems running Cygwin (www.cygwin.com). The default Cygwin installation is sufficient for running the system; a C-shell and X11 environment are required and should be part of the base Cygwin system.
Once the system is installed you can log into the system using the commands:
% mkiraf # Create the login.cl
% cl # Start IRAF
The MKIRAF command is needed only the first time you start IRAF; it creates a "login.cl" file defining the system environment and a "uparm" directory used to store modified task parameters. The directory in which you issue this command does not need to be the IRAF software tree; more typically it will be an 'iraf' subdirectory of your normal user login directory. You should always log in from a directory containing a login.cl file, or else issue a new MKIRAF to create one. The system will still start without one, but the default set of packages will not be loaded and certain user-level foreign tasks will not be available. You should not copy the login.cl file to other directories, as environment variables in this file (e.g. the path to your uparm directory) are relative. Instead, you can create a loginuser.cl file in the same directory to hold personal task definitions, package loading preferences, or environment definitions such as a logical path to a data or task directory. The last line of a loginuser.cl file must always be the CL statement keep.
The NVOSS Software Distribution uses the IRAFNET distribution of IRAF because of the needed platform support. This distribution uses the Enhanced CL by default; ECL provides an error-trapping capability, a number of new builtin functions, and tcsh/bash-like features such as history-recall and command-line editing not found in the "traditional" CL environment. Additionally, a number of new VO capabilities are present and will be described below.
The IRAF system includes a complete programming environment for scientific applications; this includes a programmable Command Language (CL) scripting facility, a Fortran/C programming API (IMFORT) to key system interfaces, and the full SPP/VOS programming environment in which the portable IRAF system and all applications are written. Script tasks may utilize host commands through the 'foreign' command interface, making it possible to develop CL script applications that mix IRAF and non-IRAF tasks (written in a variety of other languages) to process data and access services from remote VO providers.
A complete description of these features is not possible in a mini-tutorial such as this (users are referred to the document links below) and so we will focus mainly on the scripting capabilities of the system, and the many analysis and reduction tasks available, for creating science applications.
The mainstay of IRAF use has traditionally been in single-user desktop analysis. However, in the service-oriented approach of the Virtual Observatory one would ideally like the same functionality available to remote applications as a web-service (either as a traditional REST service such as an http/GET CGI script, or now as a SOAP web-service). The IRAF implementation language, resource constraints, and the evolving nature of the VO create a barrier to using a brute force method for deploying individual tasks as services. A more systematic approach is required that limits the amount of new code needed for deploying a single service, and that leverages the capabilities of more modern languages when dealing with newer technologies such as XML and web-service interaction.
To this end, a framework has been developed that is intended to interface to the entire, unmodified IRAF system using the same text command strings a user would type from their desktop keyboard. An intermediate process accepts these command strings from an open socket connection and passes them to an IRAF session spawned as a child process. Hand-written Java support code provides much of the web-service functionality required, while auto-generated Java code (created from XML service-descriptor files and transformed to Java by XSLT stylesheets) creates a web-service front-end for the task.
Using this framework, many services may be created easily by writing only the configuration file. In some cases small wrapper scripts are needed on the IRAF side to filter output of existing tasks, trap error conditions, or simply restrict the interface to the task to a subset of parameters. Creating a new science service is closely analogous to developing an entirely new task and requires much more development time; however, the transition from IRAF task to SOAP web-service remains as simple as it is for a coordinate converter service.
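The relay idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NVOSS framework itself (which is written in Java): the child process here is a stand-in that echoes each command rather than a real IRAF session, and the one-line-in, one-line-out protocol, the names `relay_commands` and `demo`, and the use of a local socket pair are all hypothetical.

```python
import socket
import subprocess
import sys

# Stand-in for the IRAF session (assumption): reads one command per line
# on stdin and answers with exactly one line on stdout.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('ran: ' + line.strip(), flush=True)\n"
)


def start_child():
    """Spawn the stand-in child with pipes attached to stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def relay_commands(conn, child):
    """Forward each command line from the socket to the child process and
    send one reply line back, until the client closes its sending side."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for command in reader:
            child.stdin.write(command)
            child.stdin.flush()
            writer.write(child.stdout.readline())
            writer.flush()


def demo(command="imstat dev$pix"):
    """Send one command through the relay and return the child's reply."""
    client, server = socket.socketpair()
    child = start_child()
    client.sendall((command + "\n").encode("utf-8"))
    client.shutdown(socket.SHUT_WR)      # signal end of commands
    relay_commands(server, child)
    with client, client.makefile("r", encoding="utf-8") as replies:
        reply = replies.readline().strip()
    child.stdin.close()                  # lets the child exit its read loop
    child.stdout.close()
    child.wait()
    return reply


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the session on the far side of the pipe sees only the text a user would have typed, so nothing in the wrapped system needs to change.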
As a traditional desktop environment, IRAF users are accustomed to working with data already on local disk (or on a local cluster of machines). In the VO environment, we must provide facilities for access to remote data in a manner familiar to users. To do this, we make use of the new VO Client interface (to be covered at various times throughout the NVOSS) to build not only traditional compiled SPP-language tasks, but also new builtin functions to the CL scripting environment where appropriate.
By making the service/data access and VO integration as seamless as possible, the IRAF environment can then be used to its full advantage in building science applications using the many tasks already available in the system. While it is true that many languages can easily access VO data/services (as you'll see), these same languages also lack much of the detailed analysis code needed to do science. The VO Client interface helps bridge the language barrier imposed by the languages and technology often used in the VO. Integrating this work with the IRAF environment provides a client-side VO environment that shifts the focus for many developers from managing the new VO technologies to doing science with VO data.
Many of the client-side techniques that will be demonstrated here can also be used in other legacy software. This is still very much a work in progress, so please feel free to ask questions and make suggestions.
| Help/Context Commands |
|---|
Upon logging into the system you'll see the 'motd' system banner and a menu of currently loaded packages. To load a new package, simply type its name. Tasks are likewise executed but may contain additional arguments on the command line to specify the input/output arguments and functional parameters. Required "query" parameters will be prompted for by the system (see below for an explanation of the parameter system) if not specified on the command line. Optional parameters are referred to as "hidden" parameters and if not specified will take the task default value, or the value last set if the task was used previously.
The phelp command paginates its output and, like help, takes as an argument the name of a task or package (in which case a one-line description of all tasks in that package is returned). The references task takes a string argument and returns a list of task descriptions matching that string, effectively allowing a keyword search. Additionally, if you are using XGterm as the terminal window, a GUI help browser is available that combines paging and searching. For example,
cl> help implot dev=gui
cl> refer photometry
apphot - Aperture Photometry Package [digiphot]
ccdtime - CCD photometry exposure time calculator [obsutil]
daophot - DAO Crowded-Field Photometry Package [digiphot]
digiphot - Digital stellar photometry package [noao]
: : : : :
Note that in this example we abbreviate the REFERENCES task name. Both
task names and parameters may be abbreviated, as can the value of a parameter
in some cases. File and image names may not be abbreviated.
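The abbreviation rule can be pictured as an unambiguous-prefix lookup. The Python sketch below is a hedged illustration of that idea only: the function name `resolve_abbreviation` and the sample task list are hypothetical, and the CL's real matching logic may differ in detail.

```python
def resolve_abbreviation(name, candidates):
    """Return the single candidate that `name` abbreviates.

    An exact match always wins; otherwise the name must be a prefix of
    exactly one candidate (assumed minimum-match behaviour).
    """
    if name in candidates:
        return name
    matches = sorted(c for c in candidates if c.startswith(name))
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"unknown name: {name!r}")
    raise KeyError(f"ambiguous name {name!r}: could be {matches}")


if __name__ == "__main__":
    tasks = ["references", "reset", "imstat", "imheader"]  # sample list
    print(resolve_abbreviation("refer", tasks))
```

With this sample list, "refer" resolves to "references", while "im" would be rejected as ambiguous between "imstat" and "imheader".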
CL statements mostly follow C syntax conventions and have the same sorts of conditional and control statements (e.g. "if-else" and "switch-case" statements, "while" and "for" loops, boolean expressions in conditionals, etc). Some differences C programmers may notice, however, are the missing '++' and '--' operators to increment/decrement variables (although "i += 1" is valid syntax), and that semi-colons are not required to terminate statements. Java programmers may notice that string concatenation may be done using a '+' sign, although the double-slash '//' is the more traditional syntax. The hash/crosshatch symbol ('#') is used to start a comment that continues until the end-of-line. Complete details of the language syntax are given in the "User's Introduction to the IRAF Command Language" and "CL Programmer's Manual" (both compressed Postscript files).
A few peculiarities of the CL deserve a bit more detailed discussion:
Similarly, sexagesimal values may be used in the CL directly since they are converted internally to floating-point values (always double-precision):
cl> x = 12:34:56.7
cl> =x
12.582416666667
cl> = (x * 15) # or " =(12:34:56.7 * 15)"
188.73625 # to convert to RA in degrees
cl> x = 188.73625 ; y = -35.8324762
cl> printf ("%H %h\n", x, y) # to print degrees as RA and Dec
12:34:56.7 -35:49:56.9
Note the use of semi-colons to put multiple statements on a single line and the use of sexagesimal values in expressions. Using an equal sign at the CL prompt makes the CL a handy calculator, or lets you inspect the value of a variable or parameter or the result of immediately executing a CL builtin function.
Also note the special formatting codes '%H' and '%h' used in the printf() statement. These are peculiar to the CL and allow the formatting of sexagesimal numbers (as HMS or DMS respectively). A complete description of the formatting codes can be seen with "cl> help printf"; otherwise they follow the C conventions.
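The arithmetic behind these conversions is simple enough to check by hand. The Python sketch below mirrors it outside the CL; the function names are hypothetical, and the output format is fixed at one decimal place of seconds to match the example above rather than to reproduce every option of the CL's '%h' and '%H' codes.

```python
def sex_to_float(text):
    """Convert 'hh:mm:ss.s' or '-dd:mm:ss.s' to a decimal value."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [abs(float(p)) for p in text.split(":")]
    parts += [0.0] * (3 - len(parts))          # allow 'dd' or 'dd:mm'
    whole, minutes, seconds = parts
    return sign * (whole + minutes / 60.0 + seconds / 3600.0)


def float_to_sex(value):
    """Format a decimal value as 'dd:mm:ss.s' (tenths of a second)."""
    sign = "-" if value < 0 else ""
    tenths = round(abs(value) * 36000)         # total tenths of a second
    whole, rem = divmod(tenths, 36000)
    minutes, rem = divmod(rem, 600)
    return f"{sign}{whole}:{minutes:02d}:{rem / 10:04.1f}"


if __name__ == "__main__":
    hours = sex_to_float("12:34:56.7")
    degrees = hours * 15                       # RA hours -> degrees
    print(hours, degrees)
    print(float_to_sex(degrees / 15), float_to_sex(-35.8324762))
```

Working in integer tenths of a second before splitting into fields avoids a rounded value such as 59.96 seconds being printed as "60.0".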
A "list directed parameter" is specified by prepending an asterisk to a parameter declaration of any type (but typically a string). These are used to read multiple values from a file with each reference returning the subsequent line from the file. For instance,
cl> type testfile
this is line 1
this is line 2
cl> string *ld
cl> ld = "testfile"
cl> = ld
this is line 1
cl> = ld
this is line 2
cl> = ld
EOF
Within a script these are typically used in a while() loop to read lines from a file, e.g.
struct title
real ra, dec
list = "testfile"
while (fscan (list, ra, dec, title) != EOF)
    printf ("%H %h %s\n", ra, dec, title)
Note the declaration of title as a struct -- String variables terminate at whitespace while struct variables continue until the end of line. In the above example, multi-word titles will be stored in a single 'title' script variable and need not be quoted.
Presently, general multi-line string values are not allowed in struct operands due to implicit limits in the length of CL string operators.
Below we see how fscan() can be used to read values from a string. In the print() statements that follow we output the values using both the traditional '//' and '+' concatenation operators to construct the string, but when using '+' we need to explicitly convert the type. Note also that unlike the printf() used above, a print() will automatically output a newline after the string.
cl> string test = "word 17 3.14 now is the time"
cl> = fscan (test, s1, i, x, line)
4
cl> print ("s1 = " // s1 // " x = " // x)
s1 = word x = 3.14
cl> print ("s1 = " + s1 + " x = " + str(x))
s1 = word x = 3.14
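The difference between string and struct targets in fscan() can be mimicked in Python: each typed variable consumes one whitespace-delimited token, and a final struct-like variable takes whatever remains of the line. This is a sketch under that assumption, with a hypothetical helper name `scan_fields`; it does not reproduce the CL's type coercion rules.

```python
def scan_fields(line, converters):
    """Parse one token per converter, then return the rest of the line
    (if any) as a final string value, like a trailing CL struct."""
    values = []
    rest = line.strip()
    for convert in converters:
        if not rest:
            break                              # ran out of input early
        pieces = rest.split(None, 1)           # first token, remainder
        values.append(convert(pieces[0]))
        rest = pieces[1] if len(pieces) > 1 else ""
    if rest:
        values.append(rest.strip())
    return values


if __name__ == "__main__":
    test = "word 17 3.14 now is the time"
    values = scan_fields(test, [str, int, float])
    print(len(values), values)
```

As with the fscan() call above, the number of values parsed is 4, and the multi-word remainder arrives as a single value without quoting.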
Next we'll see an example of the "scan-from-pipe" feature of the CL and how it can be used to capture the output of tasks to local variables rather than using intermediate files. This is most often used for simple output (e.g. the results of the HSELECT task), however many tasks include a verbose parameter to simplify the output.
cl> imstat dev$pix format-
dev$pix 262144 108.3 131.3 -1. 19936.
cl> imstat ("dev$pix", fields="mean,stddev", format-) | scan (x, y)
cl> printf ("%6.2f +/- %6.2f\n", x, y) | scan (line)
cl> = line
108.32 +/- 131.30
The CL itself can be called as a task to interpret a command, much like the Unix eval command, by constructing a command string and passing it to the CL for interpretation:
cl> s1 = "dev$pix" ; s2 = "mean,npix"
cl> printf ("imstat ('%s', fields='%s', format-)\n", s1, s2) | cl()
108.3154 262144
cl> s1 = mktemp ("tmp$tmp")
cl> imhead ("dev$pix", lo+, > s1)
cl> printf ("!grep Overscan %s\n", osfn(s1)) | cl()
BT-FLAG = 'Apr 22 14:11 Overscan correction strip is [515:544,3:510]'
But first be sure you really need to do so:
cl> match ("Overscan", s1)
BT-FLAG = 'Apr 22 14:11 Overscan correction strip is [515:544,3:510]'
In the first example we construct a call to the IMSTAT task for execution, but obviously one would simply make the call to an IRAF task directly. The second example is trivial but more realistic -- here we dump the image header to a file (named by the "s1" temp file) and use the '!' CL escape to call the unix grep utility to search for a string. In many (but not all) cases there is an alternative for common utilities like this already in the system package.
| Section | Refers to | Section | Refers to |
|---|---|---|---|
| pix[] | the whole image | pix[i,j] | the pixel value (scalar) at [i,j] |
| pix[*,*] | whole image, two dimensions | pix[-*,*] | flip x-axis |
| pix[*,-*] | flip y-axis | pix[-*,-*] | flip x and y-axis (180° rotate) |
| pix[*,*,b] | band B of three dimensional image | pix[*,*:s] | subsample in y by S |
| pix[*,l] | line L of image | pix[c,*] | column C of image |
| pix[i1:i2,j1:j2] | subraster of image | pix[i1:i2:sx,j1:j2:sy] | subraster with subsampling |
All IRAF programs which operate upon images may be used to operate on the entire image (the default) or any section of the image. A special notation is used to specify image sections. The section notation is appended to the file name of the image, much like an array subscript is appended to an array name in a conventional programming language. If no section is specified, the entire image will be used. Unfortunately image transposition is not supported by the section syntax, but almost any other combination of flips, subsamples, or range notations is supported.
In addition, MEF FITS files use a second set of brackets to indicate an extension or to pass image-kernel-specific parameters. Further details can be found in the IRAF FITS Kernel User's Guide.
Task parameters can be either query params, which will be prompted for if not supplied on the command line, or hidden params, which will use a default value defined in the task or the user-specified value on the command line. Boolean values in IRAF are yes or no values and may be abbreviated on the command line with '+' or '-', i.e. "cl> <task> bpar+" will turn on the parameter 'bpar'. Sexagesimal values are understood by the CL and will be converted to floating point before being passed to the task (but note that an RA given in hours must still be multiplied by 15 to obtain degrees).
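To make the section notation concrete, the following Python sketch applies a two-dimensional section string to an image held as a list of lines. It is an illustration only: the names `axis_indices` and `apply_section` are hypothetical, the image is assumed to be indexed as image[line][column] with IRAF's 1-based, inclusive [x,y] ordering, and three-dimensional sections and the MEF extension syntax are not handled.

```python
def axis_indices(term, length):
    """Return the 0-based indices selected along one axis by a section
    term such as '*', '-*', '*:2', '3', '2:5' or '2:9:3'."""
    term = term.strip()
    flip = term.startswith("-")
    if flip:
        term = term[1:]
    fields = term.split(":")
    step = 1
    if fields[0] == "*":
        first, last = 1, length
        if len(fields) > 1:
            step = int(fields[1])
    else:
        first = int(fields[0])
        last = int(fields[1]) if len(fields) > 1 else first
        if len(fields) > 2:
            step = int(fields[2])
    if flip:
        first, last = last, first
    if first <= last:
        return list(range(first - 1, last, step))
    return list(range(first - 1, last - 2, -step))   # reversed range


def apply_section(image, section):
    """Apply an '[x-term,y-term]' section to a list-of-lines image."""
    inner = section.strip()[1:-1]
    terms = inner.split(",") if inner.strip() else ["*", "*"]
    xs = axis_indices(terms[0], len(image[0]))
    ys = axis_indices(terms[1], len(image))
    return [[image[y][x] for x in xs] for y in ys]


if __name__ == "__main__":
    pix = [[1, 2, 3],
           [4, 5, 6]]                  # 2 lines of 3 columns
    print(apply_section(pix, "[-*,*]"))    # flip x-axis
    print(apply_section(pix, "[2:3,1]"))   # subraster of line 1
```

Note that a single-pixel or single-line section comes back here as a nested list rather than a scalar or vector, which is a simplification of this sketch.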
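The '+' and '-' shorthand for booleans can be pictured with a small Python parser for command-mode arguments. This is a rough sketch with a hypothetical function name, and it assumes the simplest reading of the rule: "name=value" sets a parameter, a trailing '+' or '-' sets a boolean to yes or no, and anything else is positional. The CL's real parser handles quoting and many other cases not shown.

```python
def parse_task_args(args):
    """Split command-mode arguments into positional values and
    keyword parameters, expanding the boolean '+'/'-' shorthand."""
    positional, params = [], {}
    for arg in args:
        if "=" in arg:
            key, _, value = arg.partition("=")
            params[key] = value
        elif len(arg) > 1 and arg[-1] in "+-":
            params[arg[:-1]] = arg.endswith("+")   # 'bpar+' -> yes
        else:
            positional.append(arg)
    return positional, params


if __name__ == "__main__":
    print(parse_task_args(["dev$pix", "fields=mean,stddev", "format-"]))
```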
| Essential Parameter Commands |
|---|
Parameters are learned between task calls, and if a default value is changed it will be remembered the next time the task is used. Task params may be reset to their defaults using the unlearn command. Similarly, changing the number or name of params requires that you unlearn the task to pick up the change. Editing all of the params for a task is done using the epar command; simply listing the parameters may be done with the lpar or dpar commands.
The CL itself has parameters, some of which can be used as variables without declaring them explicitly in a script (e.g. 'i' is an int, 's1' is a string, etc). Other parameters control the behavior of the CL; to see these use a command such as "cl> lpar cl".
The task command in the CL is used to define a new application for the system. In the case of script tasks we're considering here (all of which should be procedure scripts), the syntax is simply
The ".cl" extension on scripts is required and normally the
name of the task is the name of the file as well. Tasks that have
no parameters (i.e. arguments in the procedure statement) are required to
be declared with a '$' before the task name, for example
Scripts with parameters are declared without the leading '$' on the task name, as in the above example on the left. So-called foreign commands (i.e. host unix commands) obviously have no IRAF parameters and in addition are declared using either the reserved word foreign, to indicate these commands will be found by the user's shell environment, or with a leading '$' on the command itself if they are declared as explicit command strings. For example, to declare the host commands wget and sed as well as to create a command called tpipe that executes a method in a Java file we would do something like:
Note that in the tpipe declaration we are able to use shell environment variables. We also use the special string "$*" to pass command-line arguments through to the task at the host level. See the help page for the task statement for additional details.
A new version of the CL, called the VO-CL, has been developed for the NVOSS along with an external package of tasks. Both are built upon the VO Client interface and are still evolving rapidly, but provide a highly functional client-side interface to the VO.
New builtin CL functions are summarized to the right and provide for Registry and data access from the scripting environment. The current complement of tasks in the NVO external package are summarized below and make use of the remaining functionality now found in VO Client to implement both compiled tasks and higher-level script tasks built around the new functions. The choice of whether to create a new builtin or a standard task was made based on the expected use of the feature.
The CL Registry interface provides a basic "resolver" service that can convert some attribute of a Registry resource to some other attribute. Typically we want to use something like a 'ShortName' that is convenient to use in scripts (but isn't necessarily unique) and convert this to a 'ServiceURL' required by the data services. For example:
cl> = regResolver("USNO-B1")
http://chart.stsci.edu/GSCVO/GSC22VO.jsp?
To see whether this was a unique solution we can use the nresolved()
builtin to return the number of matching records in the last resolution
request, and then re-run the request with an 'index' of
-1 to have it list all services found. We'll also print out the 'ServiceType'
and 'Title' to probe a little deeper into what each record is.
cl> =nresolved()
2
cl> print (regResolver("USNO-B1","","ServiceType,Title",-1))
CONE USNO-B1 Catalogue
SKYNODE USNO-B1 SkyNode from VizieR
This type of resolution will become a common feature in calling services and being able to do this from the commandline is a big help when developing scripts that utilize a few select data sources.
Searching the REGISTRY using the keyword or sql interface found on the JHU/STScI web page is equally easy. For instance:
cl> int res
cl> res = regSearch("cool stars")
cl> res = regSearch("ServiceType like 'skynode'", "cool stars")
The first example does a keyword search for "cool stars"; in the second case we limit the search to only SkyNode services. What we get back is a "pointer" we pass to other builtins to further query the results, e.g.
cl> for (i=regResCount(res); i >= 0; i=i-1) {
>>> printf ("Title %d: %s\n", i, regValue(res,"Title",i))
>>> }
| NVO Package Tasks |
|---|
The REGISTRY task in the NVO package provides much of the same capability
in the traditional task form. It features an interactive loop, making it a
browser for power users. See the help page for the task for a list of the
Registry resource records that are available and more detail on the current
interface; the examples also provide more information on how this can be used.
Access to the data services can be similarly scripted and is outlined in the DAL interface functions above. There are complementary tasks in the NVO package, and presently we provide support only for Cone and SIAP services. Like the regSearch() function above, the DAL services return a "pointer" to a result that can be further queried to access the result table as well as the images or data it contains. Combined with the regResolver() function, one is able to invoke a service without knowing much more than the name; for example, a CONECALLER command line might look something like either
cl> conecaller regResolver("USNO-B1","cone") 0.0 0.0 0.05
.....results printed as text table
or
cl> int res, rec
cl> res = dalConeSvc (regResolver("GSC2.2","cone"),0.0,0.0,0.05)
cl> rec = dalGetRecord (res, 0)
cl> printf ("Found %d rows with %d attributes\n", \
>>> dalRecordCount(res), dalAttrCount(rec))
Aside from the VO-specific methods, the getData() builtin can be used to access any URL from an IRAF task. It returns a string containing the filename (which may optionally be specified as a second argument, or else a default file in your uparm will be created) and since it is a function and not a task it may be used as an argument. For example
cl> type getData("http://iraf.net")
Functions can be used as arguments to other functions as well. For example, displaying an image of some object can be done in as little as two commands:
sesame ("ngc188")
display (dalGetData (\
dalSiapSvc (\
regResolver("dss2r","sia"), sesame.ra, sesame.dec, 0.25), \
0, "foo.fits") // "[0]", 1)
In this example we use the SESAME task to resolve an object name to coordinates (the task updates its parameters with the result). We call the DISPLAY task by using the dalGetData() function to download the image, using the pointer returned by dalSiapSvc(), which requires a URL supplied by the regResolver() builtin. In this (extreme) case, we need to know that images returned by DSS are MEF files and so we append the extension number, but the example shows how powerful these functions can be in accessing data dynamically.
The following example implements a simple SIA table browser that allows the user to automatically download and examine the images found for a given object on a named SIA service. Using the new VO builtin functions and VO Client interface, we are able to concentrate on the functionality and hide some of the underlying technology. We first return the query result and then iterate over the resulting image references, using both foreign commands and native IRAF applications.
To run this example, cut-n-paste the code below and save to a file called siabrowser.cl in your IRAF login directory. Alternatively, the script may be downloaded from http://iraf.net/ftp/nvoss/siabrowser.cl.
Next, log into IRAF and declare the task as:
The use of "home$" illustrates the use of IRAF logical pathnames, where "home" is defined in your login.cl file as the IRAF login directory.
Next, start the DS9 display server and execute the task using the name of your favorite object:
Note that we use the '!' escape to start ds9 as a host command in the background before running the task.
# SIABROWSER -- Given an object name, resolve its position and query an SIA
# server for images of a given size around that object. Loop through the
# returned image table and download the FITS files, optionally displaying or
# analyzing them.

procedure siabrowser (object, size)

string object { prompt = "Object name" }
real size { prompt = "Size of search" }
string service = "DSS" { prompt = "SIA Service to query" }

begin
    string obj, svc, url, ch, imname, res
    real sz
    int siap, record, nrows, ncols, imnum = 0

    # Get params to local variables. Note we assume the NVO package
    # is loaded so we won't explicitly check for it here.
    obj = object
    sz = size
    svc = service

    reset imclobber = yes # set environment

    # Resolve to a position.
    sesame (obj)
    if (sesame.status < 0) {
        error ("Cannot resolve coordinates for '" // obj // "'")
    } else {
        printf ("Object '%s' resolved to coords '%s'\n", obj, sesame.pos)
    }

    # Query the SIA service, resolve it on the fly.
    print ("Querying server ....")
    siap = dalSiapSvc (regResolver (svc,"siap"), sesame.ra, sesame.dec, sz, sz)

    # Print some stats about the table.
    nrows = dalRecordCount (siap) # get number of images found
    record = dalGetRecord (siap, imnum) # get first record
    ncols = dalAttrCount (record) # count columns
    printf ("\nFound : %d rows with %d attributes\n", nrows, ncols)

    # Loop over the image list, downloading the file and displaying it.
    for (i=0; i < nrows; i=i+1) {
        # Download the image in the record. We assume we get a plain FITS
        # file from the service, note however that some services return MEF
        # or compressed data.
        imname = "image0" // imnum // ".fits"
        print ("Downloading image number " // imnum // " ....")
        url = dalGetStr (siap, "AccessReference", imnum)
        res = getData (url, imname)

        # Cheat: Default DSS service returns an MEF, tweak the image name
        # to access the image extension.
        imname = res // "[0]"

        for (ch = "d"; ch != "n"; ch = cl.ukey) {
            print ("")
            switch (ch) {
            case "?": print ("help for commands here...")
            case "h": imheader (imname, long+)
            case "e": imexamine (imname, 1)
            case "d": display (imname, 1, fill+)
            case "n": break
            case "q": goto cleanup
            case "s": imstat (imname)
            }
            printf ("Command? ")
        }
        imnum = imnum + 1
    }

cleanup:
    ;
end