Automated classification / How could a centralized, reliable infrastructure for fungal metagenomics be constructed?


Organizers: Andrea Porras-Alfaro, Conrad Schoch, Urmas Kõljalg


Urmas Kõljalg provided an update on the new pipeline for fungal analysis on the UNITE website.


Conrad Schoch gave us an update on GenBank's effort to create a Reference Fungal Database. He is currently working on SSU. The reference sequences could be highlighted in BLAST searches to improve confidence in taxon identification. Currently, a list of 200 taxa across the fungal kingdom is slated for initial inclusion, mainly derived from the AFToL (Assembling the Fungal Tree of Life) project. This list should be expanded with the help of third parties, including UNITE and the Bayesian classifier. GenBank is also ready and willing to link out to relevant third-party databases. One example will be to provide links to UNITE for specific accessions that have additional annotation data in UNITE.


Andrea Porras-Alfaro gave an update on the current status of the Fungal Classifier developed by Kuan-Liang Liu, Gary Xie, Cheryl Kuske (Los Alamos National Laboratory) and Andrea Porras-Alfaro (Western Illinois University). This program will facilitate automated classification of fungal sequences. We are currently creating curated databases for nr ITS rDNA and LSU rDNA. If you would like your sequences from NCBI to be considered as reference sequences in the database, which can be used for taxonomic classification of environmental samples, please contact Andrea Porras-Alfaro at a-porras-alfaro AT wiu.edu. We are looking for barcoding sequences or other well-curated sequences, and we need your participation and input to make this tool a useful resource for the scientific community.


Andrea Porras-Alfaro, Conrad Schoch, and Urmas Kõljalg are currently discussing potential ways to coordinate efforts among the GenBank Reference Database, the UNITE analysis pipelines, the Fungal Automated Classifier, and the Fungal Barcoding groups.



Open questions that need further discussion:

1. What should we do with unculturable fungi?

2. How can we accomplish/motivate faster annotation of curated databases?

3. What types of output do we need to facilitate program compatibility (for example: MOTHUR, MEGAN, UniFrac, UNITE, Lee Taylor’s Lab, etc.)?

4. Which groups are our priorities for barcode sequencing (i.e., which groups are poorly represented in GenBank)?

5. Should reference sequences in public databases be made available after a period of time? What is the best way to give credit for the use of unpublished sequences?

6. We need to develop guidelines for depositing metadata in GenBank. What is the minimum information that a sequence should have? Should mycology journals start enforcing these minimum requirements?

7. What should we do with anamorphs in terms of annotation and taxonomic classification?