Martin Fitzpatrick

Postgraduate Researcher in Metabolomics

Pathomx (née MetaPath) v2.2.0 released

MetaPath is now named Pathomx, reflecting the focus on analysis of multi-omics data within pathway contexts. Pathomx v2.2.0 has also been released for both Windows and MacOS X.

Screenshot

This latest version adds a number of important features over the previous releases:

  • A new name — Pathomx — reflecting pathway-centered analysis of multi-omics data

  • Interactive figures allowing for zooming and panning through displayed datasets.

  • Sidebar panels allowing easier configuration of tool settings with on-the-fly recalculation

  • A selection of new tools including baseline correction and autophasing for NMR spectra

Download it and try it out!

I have been busy adding demos, walkthroughs and sample datasets since the last release. A set of default/demo workflows is also being constructed to allow simpler setup for processing multiple types of data. More information is available on the website, and the latest versions of Pathomx are available from the downloads page. Feedback and bug reports are always welcome.


Transmit extra data with signals in PyQt

Signals are a neat feature of Qt that allow message-passing between different areas of your program. To use a signal you attach a function to be called in the event of the signal firing, usually accepting a small item of data about the signal state.

However, there is a limitation: a signal can only emit the data it was designed to emit. For example, a QAction has a .triggered signal that fires when that particular action is activated. Unfortunately the connected function receives only one thing: checked, which is True or False. In other words, the receiving function has no way of knowing which action triggered it.

This is usually fine. You can tie a particular action to a particular function. However, sometimes you want to connect several actions to the same handler and still treat them differently. Here’s a neat trick to do just that.

Degrees of separation

Instead of binding the target function to the signal, you can instead bind a wrapper function that accepts the original signal, attaches some more data, then passes it on. The code to do this (using a lambda) would be:

lambda checked: self.onTriggered(obj, checked)

Here we take the checked value from the signal, add the object it came from, then pass both on to the handler. All we need to do is set the object correctly when building the connection:

action = QAction()
action.triggered.connect( lambda checked: self.onTriggered(action, checked) )

Now the onTriggered handler can receive the calling action along with the check state when it’s triggered.

But wait!

Unfortunately things aren’t always that simple. If you try to build multiple actions like this by looping over a set of objects you’ll get hit by scoping problems. A lambda looks up the loop variable when it is called, not when it is defined, so every lambda sees the value from the final iteration and clicking any of the actions produces the same result. The solution is to wrap the lambda in a creator function, so that each callback gets its own copy of the value:

def make_callback(i):
    return lambda n: self.onAddView(n,i)

Here’s an example of me doing exactly that to output a list of QAction labels into a QMenu for the visual editor in MetaPath:

def make_callback(i):
    # i is bound when make_callback is called, so each lambda keeps its own index
    return lambda n: self.onAddView(n, i)

for wid in range(self.app.views.count()):
    if self.app.views.widget(wid).is_floatable_view:
        ve = QAction(self.app.views.tabText(wid), o)
        ve.triggered.connect(make_callback(wid))
        vmenu.addAction(ve)
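
The pitfall itself can be reproduced without Qt. Below is a minimal sketch in plain Python, where handler is a hypothetical stand-in for a slot such as onAddView. It shows the broken loop, the creator-function fix, and functools.partial from the standard library as an alternative way to bind the extra value.

```python
from functools import partial

def handler(checked, index):
    # Hypothetical stand-in for a slot such as onAddView: returns what it received
    return (checked, index)

# Broken: each lambda looks up `i` when it is called, after the loop has ended
broken = []
for i in range(3):
    broken.append(lambda checked: handler(checked, i))

def make_callback(i):
    # `i` is a parameter here, so every call creates a fresh binding
    return lambda checked: handler(checked, i)

fixed = [make_callback(n) for n in range(3)]

# Alternative: functools.partial stores the keyword argument at creation time
with_partial = [partial(handler, index=n) for n in range(3)]

print([cb(False) for cb in broken])        # every callback reports the last index
print([cb(False) for cb in fixed])         # each callback reports its own index
print([cb(False) for cb in with_partial])  # same result as the creator function
```

Either fix works with connect(); the creator function keeps the lambda style used above, while partial avoids the extra def.
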

MetaPath v2: Visual analysis for metabolomics [Released]

MetaPath v2.0.0 has been released today!

Screenshot

This latest version features a new visual editor for construction of analysis workflows, new analysis plugins, graphing powered by Matplotlib and all sorts of other goodness. Downloads are available for Windows and MacOS X.

For more information see the MetaPath website!


MetaPath v1.0.0 released

MetaPath v1.0.0 has been released today for both Windows and MacOS X.

In addition, MetaPath now has its own website for updates, demos, plugins and more.

This latest version of MetaPath adds a number of important features over the previous v0.9 release:

  • Multi-threaded processing, allowing multiple analysis steps to be performed in parallel.
  • Online plugin repository, with built-in installation and update system (more plugins coming soon).
  • Windows x64 build (Windows 7, Windows 8 compatible), with Linux and 32-bit Windows to follow.

You can download it and try it out.

We will be adding a number of demos, walkthroughs and sample datasets in the coming weeks. The latest versions of MetaPath are always available from the downloads page. Feedback and bug reports are always welcome.


MetaPath v0.9.9-beta released

An up-to-date build of MetaPath for Mac OS X is available for download today. This is the first public build supporting the interactive workflow, automated processing, NMR spectra processing, gene expression analysis, PCA, PLS-DA, and the rest.

Builds supporting Windows/Linux are on their way following migration to Python 3 for PyQt5 support on those platforms.

Feedback and bug-reports are more than welcome.


NCBI Gene Expression Omnibus (GEO) support added to MetaPath

The NCBI Gene Expression Omnibus (GEO) ‘is a public functional genomics data repository supporting MIAME-compliant data submissions.’ In other words, it’s an online database of freely available experimental gene-expression data. Quite useful.

To make this resource available to users of MetaPath I’ve today released a simple GEO-data import plugin. It currently supports SOFT-formatted dataset files. Support for family files is already implemented, but automatic detection of which type of file has been loaded is still on the to-do list.

For the time being, here are a few screenshots/figures generated from this GEO dataset and analysis in action:

software/metapath/geo-demo-import.png

software/metapath/geo-demo-data.png

software/metapath/geo-demo-workflow.png

software/metapath/geo-demo-pca.png

Start to finish (without a prepared workflow): 18 seconds.

The GEO import plugin is included in the default MetaPath distribution for download.


MetaPath: Example Analysis

Short demo of an experimental analysis of metabolomic (NMR) data using MetaPath. Metabolomic test dataset produced from THP-1 cells grown under normal and hypoxic conditions. Spectra (2D 1H JRES) have been pre-processed and quantified using the BML-NMR service.

The video shows an example analysis from processed data through to metabolic pathways and PLS-DA outputs.

MetaPath is an open-source workflow-based scientific analysis application. Source code is available here with packaged binaries and paper to follow shortly. More information is available on this site.


Icoshift on Python

Icoshift is a Matlab-based algorithm for the alignment of NMR spectra developed by Francesco Savorani and Giorgio Tomasi. It performs correlation shifting of spectral intervals using an FFT engine that aligns all spectra simultaneously. I’ve personally found it incredibly useful in the processing of data, particularly through Metabolab.
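
To make the idea of correlation shifting concrete, here is a minimal sketch in plain Python. It is not the Icoshift code: it uses a brute-force cross-correlation rather than an FFT, works on a whole spectrum rather than intervals, and the function names best_shift and apply_shift are made up for illustration.

```python
def best_shift(target, spectrum, max_shift):
    """Return the integer shift (in points) that best aligns spectrum to target."""
    n = len(target)
    best, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        # Correlate target[i] with spectrum[i - shift], ignoring out-of-range points
        score = sum(
            target[i] * spectrum[i - shift]
            for i in range(n)
            if 0 <= i - shift < n
        )
        if score > best_score:
            best, best_score = shift, score
    return best

def apply_shift(spectrum, shift, fill=0.0):
    """Move every point `shift` positions to the right (left if negative)."""
    n = len(spectrum)
    return [spectrum[i - shift] if 0 <= i - shift < n else fill for i in range(n)]

target = [0, 0, 1, 5, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 5, 1, 0]  # same peak, two points to the right

shift = best_shift(target, shifted, max_shift=3)
aligned = apply_shift(shifted, shift)
print(shift, aligned)
```

An FFT computes the same correlation scores for all shifts at once, which is what makes the approach practical on full-resolution spectra.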

While extending the NMR spectra processing capabilities of MetaPath it became obvious that spectral alignment would be essential in the toolkit - and Icoshift was the obvious choice. While a Matlab bridge is in progress, I wanted to see if it was possible to re-code the Icoshift algorithm natively in Python - allowing MetaPath users to get access to it without having Matlab installed.

The answer is - yes!

Icoshift in Python

The Icoshift script was converted to Python using a combination of SMOP followed by hand re-coding, using test datasets to check output at various steps. The interface remains identical to the Matlab version at present - a more Pythonic interface may be added later, but a directly comparable interface will be maintained. The Python implementation of Icoshift is available from GitHub or PyPI.

To install (assuming you have installed pip):

pip install icoshift

To use from your own script:

import icoshift
xCS, ints, ind, target = icoshift.icoshift('average', test)

Where test is a numpy.array of data - subjects in rows, ppm in columns. The outputs match those in the Matlab script: of most interest is xCS (the shifted spectra).

Here Be Dragons

Conversion from one programming language to another is not straightforward. Particularly problematic here was the different indexing - zero-indexed vs. one-indexed arrays - in Python vs. Matlab. This is simple to fix in situ, but less so when downstream code depends on it after various matrix transformations.
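
As a small illustration of the indexing difference, the sketch below converts a Matlab-style inclusive, one-based range into a Python slice. The helper name matlab_range_to_slice is hypothetical and is not part of the icoshift package.

```python
def matlab_range_to_slice(start, stop):
    """Convert a Matlab-style inclusive, one-based range start:stop to a Python slice."""
    # The one-based start becomes zero-based by subtracting 1; the inclusive stop
    # already equals the exclusive zero-based end, so it is left unchanged.
    return slice(start - 1, stop)

ppm = [10, 20, 30, 40, 50]
# Matlab: ppm(2:4) selects the 2nd to 4th elements
print(ppm[matlab_range_to_slice(2, 4)])
```

The trouble starts when an index like this is computed in one place and used several transformations later, where it is no longer obvious which convention it follows.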

At present this algorithm handles the basic default settings and no more. Contributions, bugfixes and - most importantly - Pythonification of the code are most welcome. It is duck ugly as it stands.

It is liable to break and give weird results in various corner cases not yet explored: your help is both appreciated and needed! This is intended as a first release to promote further development.

But it works

Here is some sample output (run through MetaPath - yes there is already a plugin) showing the original and shifted data from a sample manually off-shifted dataset.

icoshift/unshifted.png

icoshift/shifted.png

Thanks

Thanks to Francesco Savorani and Giorgio Tomasi for the original clever and well documented algorithm.


MetaPath gets flexible: An interactive analysis workflow tool

It’s been a while since I’ve posted an update on MetaPath development, which is finally forming into a solid package ready for publication. The latest version is available here on GitHub, with binary packages to follow in the near future.

It’s quite a transformation from earlier versions, so I thought I’d take some time to walk through the new features and ideas, with a few notes on implementation.

Where we were

With the first release of the software, we had a nice package in place offering pathway exploration, some limited pathway analysis and visualisation of experimental data on those pathways. The key feature of the package was the automated layout of metabolic pathways making visualisation of obscure (and compound) pathways relatively simple.

Over time it became apparent that other views of data would be potentially useful. A basic heatmap viewer was implemented, plus support for GPML and WikiPathways. While these worked fine, the interface was becoming slightly illogical - you could load data and design an ‘experiment’, but that would apply only to some visualisations. Similarly, there was no way to support incremental - or conditional - analysis or processing of data sets.

Scientific workflows

A scientific workflow system is a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, in a scientific application - Wikipedia

I’d got interested in scientific workflows having read about existing solutions such as Taverna. However, these existing tools are typically tailored towards fully-automated big data analysis - and require more than a passing familiarity with programming/data formats. I wanted to take some of this power and offer it in a user-friendly application for the local analysis of experimental small data sets.

The result is an application with a workflow-like internal structure, that presents data analysis in a familiar view (similar in structure to GraphPad Prism). Tasks are broken down into distinct phases - import, processing, identification, analysis and visualisation - to mimic the normal phases of experimental data handling, although this order is not fixed. A workflow overview is provided for a conceptual view of what you’ve done, and processing details are available for all resulting images.

Development has particularly focused on metabolomics data, largely since that is the focus of my PhD - however support for integration of other datasets (or multiple metabolomics datasets) was a bonus.

software/metapath/nh-demo-start.png

Data import

Import is supported from a number of sources including BML-NMR, Chenomx, PeakML, CSV, Metabolights and raw NMR spectra (Bruker currently, more to follow). One of the key features of MetaPath is that all data is stored in a regular internal format, meaning that whatever the source, downstream plugins can make use of it without knowing what it is. The data types acceptable for each processing step are defined using ‘consumers’ on each one (…but this is an internal plugin feature, and users don’t need to know a thing about it!)

Processing

Processing steps are those that transfer the raw data into a form more useable for subsequent analysis. For metabolomics analysis this includes processes such as binning, phase correction, baseline-correction etc.

Currently binning is supported (the others to follow as provided by the NMRGlue library). Importantly, MetaPath supports live binning, whereby you can see the effects of your changes on the spectra (bin size, offset), together with a data-loss visualisation indicating areas of the spectrum that are most affected. The plan is to integrate this data into the data model object, so subsequent analysis can inform you if that whopping significant difference you’re observing is really an artefact of processing.
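
As a rough illustration of what bin size and offset do, here is a minimal sketch in plain Python. It is an assumption-laden toy rather than MetaPath's implementation: bins are measured in points instead of ppm, intensities within a bin are summed, and the name bin_spectrum is made up.

```python
def bin_spectrum(intensities, bin_size, offset=0):
    """Sum intensities into consecutive bins of bin_size points, starting at offset."""
    binned = []
    for start in range(offset, len(intensities), bin_size):
        # The final bin may be shorter than bin_size if the length does not divide evenly
        binned.append(sum(intensities[start:start + bin_size]))
    return binned

spectrum = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(bin_spectrum(spectrum, bin_size=4))            # bins cover points 1-4, 5-8, 9-10
print(bin_spectrum(spectrum, bin_size=4, offset=2))  # points before the offset are dropped
```

Changing either parameter moves the bin edges, which is why a peak can be split across two bins with one setting and fall cleanly into one with another.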

software/metapath/nh-demo-spectra.png

software/metapath/nh-demo-binning.png

Other basic transformations are available including mean-centering, log transforming, baselining, setting minima both locally and globally.
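
Two of these transformations are simple enough to sketch in a few lines of plain Python. The layout (samples in rows, variables in columns), the log10 base and the clamping floor are assumptions for illustration, not a description of MetaPath's plugins.

```python
import math

def mean_centre(table):
    """Subtract each column's mean so every variable is centred on zero."""
    n_rows = len(table)
    means = [sum(col) / n_rows for col in zip(*table)]
    return [[value - mean for value, mean in zip(row, means)] for row in table]

def log_transform(table, minimum=1e-9):
    """Take log10 of each value, clamping to a small floor to avoid log(0)."""
    return [[math.log10(max(value, minimum)) for value in row] for row in table]

data = [[1.0, 10.0], [3.0, 30.0]]  # two samples, two variables
print(mean_centre(data))
print(log_transform([[1.0, 100.0]]))
```

The floor in log_transform plays the role of a global minimum: zero or negative intensities are raised to it before the logarithm is taken.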

software/metapath/nh-demo-mean-centre.png

Exclusion and filtering of data by various characteristics is also supported.

Identification

A key step in analysis is relating what you’ve found to other data sources. In order to facilitate this MetaPath supports data unification and entity-mapping for imported data sources. In English this means that it can map your data labels to biological entities.

The most basic implementation of this is using synonym name matching, but other alternatives are available including remote NMR spectra processing via MetaboHunter.
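
Synonym name matching can be pictured as a normalised dictionary lookup. The sketch below is only that picture: the synonym table, the entity identifiers and the function name map_labels are all invented for illustration and do not reflect MetaPath's database.

```python
# Hypothetical synonym table: lower-cased name -> made-up entity identifier
SYNONYMS = {
    "glucose": "ENTITY-GLUCOSE",
    "d-glucose": "ENTITY-GLUCOSE",
    "lactate": "ENTITY-LACTATE",
    "lactic acid": "ENTITY-LACTATE",
}

def map_labels(labels, synonyms=SYNONYMS):
    """Map data labels to entity identifiers; unmatched labels map to None."""
    # Normalise case and surrounding whitespace before looking the label up
    return {label: synonyms.get(label.strip().lower()) for label in labels}

print(map_labels(["D-Glucose", " Lactic acid", "unknown peak"]))
```

Labels that stay unmatched are the ones that need another route to identification, such as a spectral search.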

software/metapath/nh-demo-synonyms.png

Analysis

“The stats bit”. Analysis plugins currently include support for principal components analysis (PCA) via SVD, PLS-DA and fold-change calculations. Every other statistical test you can think of will be implemented shortly, and resulting statistical findings (p values etc.) can be passed on for subsequent visualisation on graphs. As with everything else this will be a semi-automated feature: set it up once and see the same analysis applied to subsequent data.
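
Of these, the fold-change calculation is the easiest to sketch. The version below assumes a log2 ratio of group means, which is a common convention but not necessarily the one the plugin uses; the function name and the example values are hypothetical.

```python
import math

def log2_fold_change(control, treated):
    """Log2 ratio of the treated group mean to the control group mean."""
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2(mean_treated / mean_control)

# Made-up intensities for one metabolite in two groups
normoxia = [2.0, 2.0, 2.0]
hypoxia = [8.0, 8.0, 8.0]
print(log2_fold_change(normoxia, hypoxia))
```

On a log2 scale a positive value means the metabolite is higher in the treated group, and a change of one unit corresponds to a doubling.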

software/metapath/nh-demo-plsda.png

One particularly neat feature is that because identification has already been carried out prior to analysis, the resulting figures can be automatically annotated with entity information. Here for example is the PLS-DA weights plot showing identified metabolites key to the separation.

software/metapath/nh-demo-plsda-weights.png

Metabolites identified in blue have been automatically mapped to internal BioCyc entities, and can be viewed through the internal database browser.

software/metapath/nh-demo-db-unification.png

Visualisation

Finally. You have your data, you’ve done your processing, identification and analysis, and now you want to make some nice figures for the Nature paper. MetaPath comes with a mix of standard visualisation tools and some more experimental things, in addition to the pathway rendering where this all started! Most new visualisations are powered by the d3 JavaScript visualisation library, which offers beautiful, scalable and interactive visuals. Below are a few examples of the output for the sample dataset:

software/metapath/nh-demo-bar.png

software/metapath/nh-demo-correlation-matrix.png

software/metapath/nh-demo-pathway-connects.png

software/metapath/nh-demo-metaviz.png.png

The end

The end product of all this is a completed analysis workflow and a workspace of processed data. Perhaps the neatest aspect of this is that the completed workflow can be simply re-applied to a new dataset without modification. Processing is automatic (although individual steps may be ‘paused’, e.g. when they depend on remote queries) with status flags updating on the current processing progress. Tweaking any step will automatically update all children.
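
The "update all children" behaviour can be sketched as a small tree of steps that re-run their descendants whenever they run. This is a toy model under stated assumptions, not MetaPath's internals: the Step class, its attributes and the example steps are all hypothetical, and pausing and status flags are left out.

```python
class Step:
    """A workflow step that recalculates its children whenever it runs."""

    def __init__(self, name, func, parent=None):
        self.name, self.func, self.children = name, func, []
        self.parent, self.output = parent, None
        if parent is not None:
            parent.children.append(self)

    def run(self, data=None):
        # Root steps take data directly; child steps consume the parent's output
        source = data if self.parent is None else self.parent.output
        self.output = self.func(source)
        for child in self.children:
            child.run()

load = Step("import", lambda d: list(d))
scale = Step("scale", lambda d: [x * 2 for x in d], parent=load)
total = Step("sum", lambda d: sum(d), parent=scale)

load.run([1, 2, 3])
print(total.output)
load.run([10, 20])  # same workflow, new dataset
print(total.output)
```

Re-running the root with a new dataset flows through every step, and calling run() on a step in the middle after changing it refreshes only that step and its descendants.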

software/metapath/nh-demo-workspace.png

The workflow hierarchy of data is available at any point from the Home tab. It’s currently view-only, but will be extended to allow live editing of data dependencies within the workspace.

software/metapath/nh-demo-workflow.png

What’s next?

The software is now in a bugfixing state with some minor tweaks to usability, options and configurability before publication (hopefully in the coming months) and release of built binaries.

Help is always appreciated - including translations, plugins, or suggestions. I’d also love to hear examples of other people using MetaPath in their analysis. Get in touch!


Cytoscape v3.0.2 on Mac OS X

Cytoscape 3.0.2 was released on 1st August, a bugfix release for the 3.x series. However, there is a compatibility issue with the latest Java update from Apple resulting in Cytoscape hanging on startup. I’d followed advice found elsewhere to update to the latest Java version with no success, until I found this post from Tim Hull, outlining a simple way to check what version you have installed.

/usr/libexec/java_home -v 1.6 -exec java -version

If the output contains “xM4508”, you have the broken version. If it contains “xM4509”, this is probably a different issue…

If you find that you have the bad version, you can download the updated version from Apple using the links below:

The updated Java fixed the problem for me.
