Mining more out of data: Participant experiences from the Genotyping Support Service trial phase


By Humberto Gomez, GCP GSS Coordinator and Laura Ruiz, SP5 Programme Assistant

From 24 to 28 September 2007, a workshop took place in Zaragoza, Spain, to support the analysis and interpretation of data from the Genotyping Support Service (GSS). Seven researchers were invited to attend, from Bolivia, Brazil, Colombia, Chile, the Philippines, Tanzania and Nigeria. Fred van Eeuwijk, Hans Jansen and Marcos Malosetti from Wageningen University acted as facilitators. The meeting, which was organised by GCP in collaboration with the Instituto Agronómico Mediterráneo de Zaragoza (IAMZ), successfully achieved its purpose.

On the first day, participants presented their research objectives and the data available to them, obtained both from the GSS exercise and from other sources. They were also asked to comment on their expectations of the workshop. The instructors provided individualised comments, emphasising the need to have the data properly structured and to formulate the research questions properly before starting the analysis. The importance of the outline and structure of the statistical models was also highlighted in the group discussion. Following the initial presentations, participants were asked to redefine and outline the statistical models for their projects, taking both the phenotypic and the genotypic information into account.

Participants then worked on setting up their data in the right format. Next, Drs. Jansen and Malosetti showed them how to access and install free statistical software, so that participants could analyse the data they had prepared in spreadsheets the day before.

Emma Sales and Alberto Vilarinhos studied the population structure of their Musa data. Dr. Sales successfully explored the possibility of finding evidence to support a novel botanical classification for some of her Musa accessions. She also found indications that the morphological classifications of some of her germplasm need to be redone. Alberto Vilarinhos looked for duplicates in his germplasm and for algorithms that find small sets of markers able to discriminate the largest number of accessions.

Emmanuel Okogbenin studied his hypothesis of having found a new genetic source of resistance to Cassava Mosaic Disease (CMD). He concluded that he has two markers significantly associated with new sources of CMD resistance and was able to determine TMS30555 and NR8083 as the likely sources. Heneriko Kulembeka, together with Cesar Ospina, studied the results of Marker Assisted Breeding data for sets of Tanzanian cassava germplasm. They were able to identify good combinations of parents for CMD resistance and learned to use statistical software for that purpose. This team concluded that the first year of marker-assisted selection (MAS) in the field allowed resistant parents to be spotted much faster than with conventional approaches. Dr. Kulembeka also realised the need to ensure a good layout in future trials, for more efficient collection of phenotypic data and for improved statistical analysis. He found supporting evidence for modifying their breeding programme by making more crosses from parents that combine resistance to both CMD and cassava brown streak disease (CBSD), e.g. AR30-3 x Namikonga and AR42-4 x Namikonga.

Boris Sagredo was interested in mapping insect resistance traits in segregating potato populations. He found more effective statistical means of handling phenotypic data generated in different assays from larval force-feeding experiments on potatoes. He found that insect resistance responses differed among the populations, and he was able to map these quantitative responses.

Jorge Rojas analysed the population structure of a Bolivian potato collection. He was able to determine the genetic structure of the collection and concluded that it did not seem to be associated with geographical distribution. These findings are helping Dr. Rojas to refine his plans for designing core collections, which is a major institutional goal. Some accessions did not fall inside their expected species group, so their taxonomic classification will be revised. In addition, Dr. Rojas located two potential duplicates in the germplasm. He also analysed a data set of groundnut germplasm collected in Bolivia and was able to detect relatively large genetic variation and to determine its genetic structure. Thanks to the genotyping data and its analysis, he expects to be able to offer more effective support to the groundnut and potato breeding efforts of his institution.

On the last day of the workshop, all participants presented an outline of the information and results of the analyses performed during the week. They agreed that the course had been very fruitful for learning how to implement and perform correct statistical analyses, and they highlighted the performance of the instructors as remarkable. Most of the participants believe that they will be able to generate peer-reviewed publications from this work.

During the evenings, the group visited some historical sites in Zaragoza, such as the ruins of the Roman Theatre, found some 40 years ago in the city centre, and the Aljafería Palace, an outstanding palace built by the Moors during their occupation of Spain, which has played important roles throughout history. The event closed with a pleasant dinner featuring typical gastronomy from the Aragon region.