DATA MINING, BIOSTATISTICS

To make sense of the vast quantities of biological data generated by high-throughput experiments, we apply machine learning techniques that extract complex multidimensional dependencies. These methods allow us to select important features, compare multidimensional objects, find clusters of similar objects, and measure the strength of linear and non-linear relations. We use the same tools to build predictors of biological properties that work on previously unseen data and provide guidelines for planning further experiments. Apart from data analysis, we also develop new computational methods designed to cope with the challenges posed by biological data: data quality, data volume, scarce observations, and missing records.