A rather dull worKLOG. This is just a scratchpad for solutions to IT problems that might be useful to someone else. Expect no opinions, no brilliant insights and definitely no pictures of pets or children. Expect stack traces, code snippets and other hints for the Google Indexer.

Friday, September 29, 2006

System Tests for AG@ROE

One of the biggest things we have to address if we are to persuade astros that the VO is a serious project is reliability. Currently too many services fall over or don't meet interoperability standards fully.
At Edinburgh we are hosting the full range of VO Services from datasets such as the WFCAM Science Archive to applications such as the Anomaly Detector and STILTS to infrastructure such as MySpace. We have to make sure that we know that systems have fallen over before our customers do.
We are going to need a multilayered solution, from simple-minded pinging of webservices such as that provided by MARS (currently down, awaiting a Java upgrade) to the heartBeat service in the VOSupportInterfaces currently being developed by the IVOA GWS group. Going to greater depth, I think we need automated system tests of all of our user-facing components, testing them in the same way as a user would. These system tests would be run (say) once a day, rather than providing the real-time feedback offered by the other techniques. For example, MySpace should be tested by creating folders and files, storing and retrieving data. A dataset should be tested by running a standard conesearch etc.

A convenient way to run these tests is via the Astro Server Runtime developed by Noel Winstanley. This offers an easy API to all the VO services we deploy here at Edinburgh and allows me to write the tests in my language-of-choice. From my previous experience of running the Integration&Release process, I've decided to go with JUnit tests wrapped in a Maven project run by a cron job. This is because JUnit tests are easy to write, and Maven provides easy facilities to publish reports to the web on the success rate of the tests.
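As a rough illustration of what a daily conesearch check might assert, here is a minimal sketch in plain Java. It does not use the Astro Server Runtime API; the class name ConeSearchCheck, both method names and the example.org base URL are made up for illustration. The RA, DEC and SR parameter names (decimal degrees) come from the IVOA Simple Cone Search convention, and the response check assumes the service returns a VOTable in TABLEDATA form.

```java
import java.net.URI;

/** Hypothetical helper for a daily cone-search smoke test (not part of the ASR). */
final class ConeSearchCheck {

    /** Builds a cone search query URL; RA, DEC and SR are in decimal degrees. */
    static URI queryUrl(String baseUrl, double raDeg, double decDeg, double radiusDeg) {
        // Work out how to join the query onto the base URL.
        String separator;
        if (baseUrl.endsWith("?") || baseUrl.endsWith("&")) {
            separator = "";
        } else if (baseUrl.contains("?")) {
            separator = "&";
        } else {
            separator = "?";
        }
        return URI.create(baseUrl + separator
                + "RA=" + raDeg + "&DEC=" + decDeg + "&SR=" + radiusDeg);
    }

    /** Crude check: the body mentions a VOTABLE element and at least one table row. */
    static boolean looksLikeNonEmptyVOTable(String body) {
        return body.contains("<VOTABLE") && body.contains("<TR>");
    }

    public static void main(String[] args) {
        // Assumed endpoint, for illustration only.
        System.out.println(queryUrl("http://example.org/cone?", 180.0, -1.5, 0.1));
    }
}
```

A JUnit test would fetch the URL built by queryUrl and assert that looksLikeNonEmptyVOTable is true for the response, using a position known to contain sources in the dataset under test.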

Status of the System Tests project:
  • Tests for Community and MySpace written
  • Tests for conesearches fleshed out
  • Code has been put into cvs under: http://www.astrogrid.org/viewcvs/test/AgAtRoe/
  • Bugs in the Workbench and Astro Server Runtime have been identified that block further development
  • Need to restructure build to use the full, un-Proguarded ASR.

Tuesday, September 26, 2006

(Two) Weekly Summary 27/9/06

  • One Week at IVOA Moscow (Notes)
  • Released a draft of the Plastic spec for the IVOA note, which received quite a thorough going over from MT.
  • Started work on System tests for AG@ROE

Anomaly Detector problem

There seems to be a problem running the Anomaly Detector algorithm at
http://eurovotech.org/twiki/bin/view/VOTech/AnomalyDetection

This algorithm is currently in the ROE Test registry, not in the main AstroGrid registry. Unfortunately the new workbench seems to be disregarding registry settings and using Leicester regardless. This might be linked to the changeover to the streaming registry client.

Useful page:
Workbench configuration options.

(Problem now fixed in HEAD)

Wednesday, September 13, 2006

(Two) Weekly Summary 13/9/06

  • In Strasbourg at VOTech meeting.
    • Progress on messages for clustering and spectra
    • approval from MT and TB for IVOA note
    • Interesting tool from Bruno...might help with SIAP service for SSA
    • Information from Guy on secure SIAP - seems that TLS is the way forward, provided we can find a container that can deal with certificates.
  • I've asked Pierre Fernique (deputising for Mark Allen) for a slot at the IVOA meeting - it seems they're not inundated with speakers at the apps IG.
  • Marked MSc dissertation.



Action Items:
--------------

(68.2) JDT to give an ROE coffee talk with a demo of AstroGrid
- maybe also including access to UKIDSS data.
[open: decided to postpone this until after registration of users]

(69.2) JDT to pass this up to Noel Winstanley and/or Kevin Benson.
[closed: see AG@ROE below]
Follow up: JDT to fix registry entries

(70.4) JDT to contact Guy Rixon to check that their plans for wrapping
STIL(TS) as CEA services are consistent.
[closed: JDT decided that the best thing to do would be to deploy here
a clone of the Cambridge set-up. STILTS can auto-generate the required
application description, so the descriptions for the two installations
can be identical, which would enable JES to pick one or other of them.]

Followup: JDT has now cloned the STILTS installation at the ROE. This is using STILTS' auto-generated app-description. There are a couple of minor issues: some of the description metadata is HTML-formatted and doesn't play nicely with the Workbench, and one of the parameter types has been incorrectly specified (the constant specifying the STILTS command). One of the disadvantages of cloning the Cambridge installation is that we in Edinburgh don't get any credit - it's got IoA logos all over it.

(70.5) JDT to contact Phil Nicholson to ask about his plans for
a standard SIAP component.
[closed: see AG@ROE below]

Follow up: see JDT comments above.

AOB
---

IBM hardware
JDT mentioned that there had been an email circulated about the possibility of
some surplus IBM hardware becoming available on site soon.
Any news?

Tuesday, August 29, 2006

Thunderbird mail forwarding from a filter broken

I want to use a spare Gmail account to back up all my incoming and outgoing mail in Thunderbird. Outgoing is easy enough - for each account under tools/accounts/Copies & Folders you add a line to the Bcc section. To back up incoming mail, I created a filter to act on the inbox, find "read" messages and forward them. This ought to work but doesn't....I just get an alert popup box with "sending of messages failed." It seems I'm not the only one.

Java signed applet security insanity

I've been looking into using signed applets as a javascript-plastic bridge. The applets have to be signed so that they can make RMI connections to the Plastic Hub - this works well. Then through the wonders of LiveConnect you can call methods on the applet from JavaScript, and thanks to a magical object called JSObject you can make callbacks back to the JavaScript from the Java. This also, mostly, works well. However, try deploying the applet anywhere but localhost, and some of the JavaScript->Java calls fail with security exceptions (see below). Not all, just some. The reason? Apparently, if the JavaScript itself isn't signed, then any method calls originating in the JavaScript are treated as if the applet itself isn't signed - you lose all your signed applet privs. What kind of deranged madman thought that up? It's useless as a security benefit - all you do is put the controversial stuff into a separate thread, so that the JavaScript to Java calls merely set and get properties...so it's trivially subverted. Instead it just makes your code more complicated than it need be, and confounds the poor programmer with mysterious errors.

Implementation note: it's not good enough just to spawn a new thread in the method called by JavaScript, as this will inherit the security context of the calling thread. Instead you need to instantiate a worker thread in (say) the start() method of the applet, and have the JavaScript-called method hand tasks to this worker. You can see a (rather naive) implementation here.
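For the record, the hand-off might look something like the sketch below. This is not the implementation linked above: the class name PrivilegedWorker and its methods are invented for illustration, and it uses java.util.concurrent with modern syntax for brevity. The idea is simply that the thread is created once, outside any JavaScript-originated call, and the JavaScript-called methods only put work on a queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical worker: runs tasks on a thread created outside the JavaScript call. */
final class PrivilegedWorker {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final Thread worker;

    /** Construct this from the applet's start(), never from a JavaScript-invoked method. */
    PrivilegedWorker() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Runnable task = tasks.take(); // blocks until a task is handed over
                    try {
                        task.run();
                    } catch (RuntimeException e) {
                        e.printStackTrace(); // one bad task should not kill the worker
                    }
                }
            } catch (InterruptedException e) {
                // stop() was called: let the thread end
            }
        }, "privileged-worker");
        worker.setDaemon(true);
        worker.start();
    }

    /** Safe to call from JavaScript: it only queues the task and returns at once. */
    void submit(Runnable task) {
        tasks.add(task);
    }

    /** Call from the applet's stop(). */
    void stop() {
        worker.interrupt();
    }
}
```

A JavaScript-visible method such as the applet's sendLoadVOTable would then do nothing more than call submit() with a task that makes the RMI call to the hub.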

-----------
An alternative solution:
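Put the "untrusted code" in a doPrivileged block (this needs java.security.AccessController and java.security.PrivilegedAction imported):
  void somemethod() {
      // ...normal code here...
      // The permission check for the code inside run() stops at this class,
      // so the unsigned JavaScript caller further up the stack is ignored.
      String user = AccessController.doPrivileged(
          new PrivilegedAction<String>() {
              public String run() {
                  return System.getProperty("user.name");
              }
          }
      );
  }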




Stacktrace for Google's benefit:
java.security.PrivilegedActionException: java.lang.reflect.InvocationTargetException
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin.liveconnect.SecureInvocation$2.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin.liveconnect.SecureInvocation.CallMethod(Unknown Source)
at sun.plugin.liveconnect.SecureInvocation.access$300(Unknown Source)
at sun.plugin.liveconnect.SecureInvocation$CallMethodThread.run(Unknown Source)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.plugin.javascript.invoke.JSInvoke.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.plugin.javascript.JSClassLoader.invoke(Unknown Source)
at sun.plugin.liveconnect.PrivilegedCallMethodAction.run(Unknown Source)
... 6 more
Caused by: java.security.AccessControlException: access denied (java.net.SocketPermission 10.0.0.8:2234 connect,resolve)
at java.security.AccessControlContext.checkPermission(Unknown Source)
at java.security.AccessController.checkPermission(Unknown Source)
at java.lang.SecurityManager.checkPermission(Unknown Source)
at java.lang.SecurityManager.checkConnect(Unknown Source)
at sun.rmi.transport.tcp.TCPChannel.checkConnectPermission(Unknown Source)
at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
at sun.rmi.server.UnicastRef.invoke(Unknown Source)
at net.ladypleaser.rmilite.impl.RemoteInvocationHandlerImpl_Stub.invoke(Unknown Source)
at net.ladypleaser.rmilite.impl.LocalInvocationHandlerImpl.invokeRemote(LocalInvocationHandlerImpl.java:53)
at net.ladypleaser.rmilite.impl.LocalInvocationHandlerImpl.invoke(LocalInvocationHandlerImpl.java:41)
at $Proxy0.requestToSubset(Unknown Source)
at org.votech.plastic.managers.PlasticApplication.sendMessage(PlasticApplication.java:166)
at org.votech.plastic.managers.PlasticApplication.sendMessage(PlasticApplication.java:140)
at uk.ac.roe.PlasticHubApplet.sendLoadVOTable(Unknown Source)
... 16 more

Wednesday, August 23, 2006

How to get an applet to repaint

You have an applet. In response to some event you want to change all the components it contains. However, your changes don't appear...why?
If I were using a JFrame, I'd liberally throw in a few pack() calls, which cause the JFrame to resize itself based on the new components and redraw, but you don't have this option with an applet. After some reading of the javadoc, it seems that a combination of validate(), invalidate() and repaint() ought to do it. Some experiments later, it seems that
  • invalidate does nothing
  • validate shows up the new components, but over the old ones
  • repaint erases everything
  • invalidate followed by validate is the same as validate
  • invalidate followed by repaint is the same as repaint

  • validate followed by repaint does the trick
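In code, the winning combination comes out as something like the helper below. It is only a sketch: the class and method names are invented, and it takes a plain java.awt.Container (which an applet is) so that it can be tried outside a browser. It should be called on the event dispatch thread, like any other change to components.

```java
import java.awt.Component;
import java.awt.Container;

/** Hypothetical helper: swap a container's contents and make the change visible. */
final class ContentSwapper {

    /** Replaces everything in the container, then validates and repaints, in that order. */
    static void replaceContents(Container container, Component... replacements) {
        container.removeAll();
        for (Component c : replacements) {
            container.add(c);
        }
        container.validate(); // lay out the newly added components
        container.repaint();  // request a redraw so the old components' pixels go away
    }
}
```

From an applet's event handler this would be called as ContentSwapper.replaceContents(this, newComponent).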

Copying a list of files in Ant

One might think that such an everyday task as copying a list of files would be trivial in Ant. Not so - it's a bloody nightmare.
  1. First attempt: creating a FileList would seem to be what you want. Nope - the Copy task doesn't support FileLists. Why?????
  2. Second attempt: create a FileSet (with the usual substitution of angle brackets....would it be too much to ask to have a blog site that doesn't stuff up xml?):

    [copy todir="foo"]
    [fileset dir="bar"]
    [filename name="pathfrombarto/file"/]
    [filename name="differentpathfrombarto/file2"/]
    [/fileset]
    [/copy]

    Nope - the filenames are mutually exclusive - the fileset does an "and" on them, and since no file will match both names, it copies nothing. Ridiculous.
In any case, the fact that you're forced to use a FileSet and its pattern matching means that if you have a large set of files below "dir" (e.g. a Maven repository) then the above takes an age.

The solution, incidentally, is:

[copy todir="foo"]
[fileset dir="bar"]
[or]
[filename name="pathfrombarto/file"/]
[filename name="differentpathfrombarto/file2"/]
[/or]
[/fileset]
[/copy]

Though it still takes an age to process.

Monday, August 21, 2006

Weekly Summary 30/8/06

  • Guy has created a new community with password change facility - passed I&R and ready to install.
    • Tried to install, but it won't self-register: see bug 1789
    • Guy has put a fix on the merge queue for this...in the meantime we can hack the reg entry....done...appears to work well. Except.....you can change your password with any old password.
    • Mark to reinstall when it passes I&R? How did this error slip through I&R in the first place?
  • Submitted abstract to ADASS 2006 http://www.adass.org:8080/Conferences/2006/Venue/people/participants/abstract?abstract_id=82
  • Planning: using Open Workbench http://www.openworkbench.org/ to create GANTT charts and managed dependencies between tasks.
  • Downloaded and installed Eirik http://wiki.eurovotech.org/twiki/bin/view/VOTech/EirikDemo on a Debian machine. Plastic and AR functionality appear fine. Couldn’t test further due to window size issues. Installation straightforward - install the Debian version of R first, then hand edit the Eirik startup script to find the location of libR.so.
    • Tried to install Eirik on Shuna (Bob’s machine). Since I’m not root this involved building R from source using ./configure and make. This is fairly easy to do, following the instructions, but care must be taken in the configure stage to request that a shared library libR.so be built, and also to set --with-readline=no on Shuna. Unfortunately getting Eirik to run on Shuna wasn’t do-able due to assorted missing shared libs on this machine.
    • However, following several iterations, did get a windows version running.
  • Hardened the hub to concurrency problems and clients passing shit values.
  • Jeff and others are experiencing some AR problems - AR is refusing RMI connections. I also have the same problem from time to time and it’s usually cured with a reboot. Mark’s Plaskit hub doesn’t suffer from this. Suspect it’s something to do with the way the AR deals with IP addresses - a network outage is enough to screw it up.
  • Finished Anomaly Detection pipeline
    • Looking at deploying STILTS at the same time. There are problems with the existing STILTS deployments discussed here. (Fixed by Kona). Get Kona/Mark to install?
    • Standard and Advanced interfaces now done, and registered. All works smoothly, but not very robust to errors in data since fastem locks up waiting for a user response....not a great idea for a CEA app.
    • Created an applet to allow users to simply change their workbench registry settings. This was also a learning exercise for creating a plastic-applet.
    • Instructions for use are here.
  • Finished a prototype Plastic-compatible web page...this is aimed at our own WSA, but also to get DataScope on board.
  • Received request from Noel to smarten up our metadata - we need to discuss logos.
  • Checked Son Of GAIA for Nigel, and found that I can’t run the webstart version either due to missing classes. Probably due to the 1.5 java libraries having renamed some classes. Reported to ussc@star.rl.ac.uk...address is defunct...emailed Mark Taylor...apparently you need to install the JAI package first (not documented!).
  • Assisted Brian getting his weka cea apps deployed to our test registry. We encountered a lot of out-of-memory errors in the registry, and also a particularly annoying bug necessitating the reconfiguration of the registry and changing its authority Id.
  • Assisted Brian changing the registry his workbench uses.
  • Realised that if we have a test registry, we need a test JES, so installed one here. The official release of JES is lacking an astrogrid-security jar, which you have to add by hand to WEB-INF/lib. Catherine has now released a new build, which Mark should test.
  • Thanks to Brian, reproduced the mysterious "Found Resource Document, but had no AccessURL" that has been plaguing systems that harvest from our publishing registry and try to access our data sets. Bugzilla report. It appears to be due to our use of Java 1.5. I've found that most of our old datasets suffer from the problem, but that it doesn't occur in the latest snapshot build of the workbench. Hence we could ignore it - it will go away when we reregister our new services anyway. Left it in Mark and Kevin's capable hands.
  • Republished Plastic site with google analytics code throughout
    • mvn 2.0.5 not yet released, and the bugs in 2.0.4 make site publication a painful process as
      • site:deploy doesn't work in windows
      • site:stage doesn't copy the contents of directories, effectively making multiproject site deployment useless



Actions
(68.2) JDT to give an ROE coffee talk with a demo of AstroGrid
- maybe also including access to UKIDSS data.
pending...


(69.1) JDT and KEA to produce a plan for providing password-changing
functionality for ROE account holders registering for AG.
DONE - see above

(69.2) JDT to pass this up to Noel Winstanley and/or Kevin Benson. DONE - see above

ACTION (70.1) MSH to install the new version of Community.
done by JDT

ACTION: (70.4) JDT to contact Guy Rixon to check that their plans for wrapping
STIL(TS) as CEA services are consistent.
DONE - see above

Thoughts:
Need better cover for MSH's absence

Creating the Anomaly Detector CEC

I'm installing Andy Connolly's Expectation Maximization/Anomaly Detection algorithm into an AstroGrid CEC (Common Execution Connector) to make it available to the VO. Along the way I've been using the splendid STILTS utility to do all the VOTable conversions.
Things I've learned along the way
  • STILTS is sufficient to take a table in assorted different formats, and extract a user-defined set of columns (specified by position or name), and export the results as an EM-friendly space-separated text file. It is also smart enough to take the generated list of row numbers and p_values and do an exact cross match on the original table...thus generating a clone of the original table, sorted by p_value and with the p_values attached as an extra column. In short, it does all the table manipulation needed to transform a VO-standard table into a proprietary format and back again, with only a little help from awk.
  • Any processes started by the CEC must NOT ask the user for input. The current EM-algorithm prompts the user for input if it fails for some reason...this seriously screws up your CEC. The solution is to add a test on the return value from the STILTS preprocessing step, before the EM-algorithm is called:

      #echo $?
      #Try to protect fastem from bad data
      if [ "$?" -ne "0" ]; then
      echo "STILTS failed to prepare data. Aborting"
      exit 1
      fi

  • You can only specify a registry template for a single CEA application. This means that if your CEC supports several applications, you need to edit the registry description by hand after it's been generated.
  • You can specify an optional argument in the CEC-CL config with (insert)
  • The CEC-CL only supports the following kinds of parameters:
    • -key value
    • positional
    • key=value
  • Unfortunately the EM algorithm takes parameters as
    • key value
  • You can get round this by pattern matching on the params with something like ${@//=/ } which takes all the parameters in one go and replaces all = with ' '

Friday, August 18, 2006

New week, new blog

Still looking for the perfect home for my blog. All I want is something that's quick and responsive that I can use to jot down notes from work to give my ageing memory a break. Plus...when you spend all day wrestling Maven into submission, it's possible that what you've learned just might save someone else some grey hairs.

So, I'm defecting from wordpress: http://jontayler.wordpress.com/ to here.