This week I would like to introduce you to Hendrik Luuk, co-founder and CTO of Xpressomics. He kindly agreed to answer some questions for our blog post series, and here it is – first-hand information on Xpressomics. Keep reading to find out more about this company and their brand new gene expression search engine.
Hendrik Luuk, CTO and Co-founder
1. Could you please give us a short introduction of Xpressomics (goals, interests, mission)?
The company was established in order to make gene expression analytics accessible to the majority of life scientists. Our goal is to re-analyze and index tens of thousands of publicly available datasets and offer a search engine to navigate the results. Our interest really is to enable scientists to reinterpret their results in the light of all other experiments ever made. So far you essentially had to perform a text search of indexed PDFs to find what had been published about your gene of interest. With the gene expression search engine, you can easily query one or more genes to identify experimental conditions where they are differentially expressed. It is about connecting the dots between unrelated data sets to gain insight into gene function and regulation.
2. What is your role in the company and how does your professional background fit it?
My background is in molecular biology, behavioural neuroscience and bioinformatics. I have years of wet-lab experience ranging from cloning gene targeting constructs and neuroanatomical mapping of proteins to the analysis of circadian behavior. My role in Xpressomics is the development of the analytics backend and the core of the search engine.
3. What makes Xpressomics unique?
It’s about the diverse skillset of the team and a clear vision. Our team has deep skills in experimental biology, bioinformatics, cloud computing, infrastructure development as well as business development and sales. We are moving fast and focusing on the problems that we see as major bottlenecks to new discoveries, one of which is the interpretation of the vast amount of data generated by high-throughput technologies. As the data is usually archived in raw form, it needs to be processed using appropriate statistical methods, indexed and served via an easy-to-use interface to reach everyone potentially interested.
4. You have just launched a new product – your Gene Expression Search Engine. Could you please explain how it works?
You can enter one or more gene symbols to identify experiments where your genes of interest are differentially co-expressed. If you enter a longer list of genes, it will tell you which experiments produced similar results to your query. We annotate experimental factors in thousands of experiments and run differential expression analysis between appropriate conditions to identify which genes are significantly affected by the treatment. As a result we get a huge number of differential expression profiles relating to a wide range of experimental manipulations, which we then index so that researchers can access them easily.
Learning which experimental treatments are known to produce similar (or diametrically opposite) responses will aid in gaining insights into gene function and regulation. The tool will allow you to navigate the results from thousands of individual experiments much faster and in far more detail than would be possible by studying article reprints or supplementary data files. It will also be useful for machine-aided curation of regulons. So much like Google is indexing the web, we are indexing and curating the publicly available gene expression data. Check out the tool at www.xpressomics.com.
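The indexing and lookup described in this answer can be illustrated with a small sketch. It is not the Xpressomics implementation: the experiment names, gene symbols, the toy profiles and the scoring rule are all hypothetical, chosen only to show how a gene-to-experiment index supports both kinds of query (a few genes, or a longer signed gene list).

```python
from collections import defaultdict

# Hypothetical toy data, not real results:
# experiment id -> {gene symbol: direction of change}
# (+1 = up-regulated, -1 = down-regulated in the treatment).
PROFILES = {
    "exp_A": {"GENE1": 1, "GENE2": 1, "GENE3": -1},
    "exp_B": {"GENE1": -1, "GENE2": -1, "GENE4": 1},
    "exp_C": {"GENE3": 1, "GENE5": 1},
}


def build_index(profiles):
    """Map each gene to the set of experiments where it is differentially expressed."""
    index = defaultdict(set)
    for exp_id, genes in profiles.items():
        for gene in genes:
            index[gene].add(exp_id)
    return index


def find_coexpressed(index, genes):
    """Return experiments in which every query gene is differentially expressed."""
    hits = [index.get(gene, set()) for gene in genes]
    return sorted(set.intersection(*hits)) if hits else []


def rank_by_similarity(profiles, query):
    """Rank experiments by signed agreement with a query profile.

    Assumed scoring rule: sum of direction products over shared genes,
    divided by the query size. Positive scores mean a similar response,
    negative scores an opposite one.
    """
    scores = []
    for exp_id, genes in profiles.items():
        shared = set(genes) & set(query)
        if not shared:
            continue
        score = sum(genes[g] * query[g] for g in shared) / len(query)
        scores.append((exp_id, score))
    # Strongest agreement or disagreement first; ties broken by experiment id.
    return sorted(scores, key=lambda item: (-abs(item[1]), item[0]))


index = build_index(PROFILES)
coexpressed = find_coexpressed(index, ["GENE1", "GENE2"])
ranking = rank_by_similarity(PROFILES, {"GENE1": 1, "GENE2": 1, "GENE3": -1})
```

In this toy setup, a two-gene query returns the experiments where both genes changed, while the longer signed list ranks experiments by how closely (or how oppositely) their response matches the query.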
5. What do you feel are the advantages that your new tool gives the customers?
The differential expression analysis has been performed using a sensitive probe-level algorithm (DEMI), which is better at extracting value from datasets with a small number of replicates without compromising specificity. It is quite surprising how many public data sets have only one or two replicates per treatment. DEMI is just about the only method that has been shown to extract meaningful results from very small data sets.
The major advantage of the search engine is its ease of use. Simply by entering a gene symbol you will get a bird’s-eye view of pre-analyzed experiments, sorted by relevance. We see this as an indispensable tool for reinterpreting your results in the light of existing data, yet one that is super simple to use.
6. Do you recognise the problem of limited sharing of genomic data for research and diagnosis? Can you think of an example of how the work of Xpressomics supports data access and knowledge sharing within the genomic community?
Data sharing is perhaps the biggest problem the life sciences community is facing. In our view, every piece of experimental data should be publicly available in its original form to support aggregation, meta-analysis and querying. Just think how many discoveries are delayed simply because we are not aware of experiments that have, perhaps unintentionally, addressed the question already. It’s an enormous waste of resources and time. The same applies to diagnostic information. Xpressomics opens up the results from archived expression data to public scrutiny. We expect it to be of tremendous value as we analyze the publicly available gene expression data and make it available for all to use in their research activities.
Are you part of a project that facilitates data sharing for genomics research? Would you like to be featured on our blog? We would love to hear from you. Drop us an email or use our contact page to get in touch.